`BinaryReader` requires strings to be encoded in the same format that `BinaryWriter` uses when writing them: the string is prefixed with its length, and that length is encoded as an integer seven bits at a time.
According to the `BinaryReader.ReadString` documentation, the method reads a string from the current stream, where the string is prefixed with its length, encoded as an integer seven bits at a time.
In other words, the length is stored immediately before the string data as a 7-bit encoded integer. The encoding itself is described in the `BinaryWriter.Write7BitEncodedInt` documentation.
This means that the integer value is written out in seven-bit chunks, starting with the least significant bits. Each byte contains a high bit indicating whether more bytes are needed to represent the full integer.
If the value fits within seven bits, it takes up only one byte. If not, the high bit is set on the first byte and the remaining bits carry over into the following byte, repeating until the entire integer has been written.
This is a variable-length encoding, unlike the fixed 4 bytes normally used for Int32 values. Short strings need fewer than 4 bytes for the prefix (e.g., a string shorter than 128 bytes needs only 1 byte for its length).
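To make the byte layout concrete, here is a minimal sketch of the writing side in TypeScript. The function name `encode7BitInt` is hypothetical, and the sketch assumes a non-negative 32-bit value, which is all a string length prefix needs:

```typescript
// Sketch of the 7-bit variable-length encoding described above.
// `encode7BitInt` is a hypothetical helper, not part of any library.
function encode7BitInt(value: number): number[] {
  if (!Number.isInteger(value) || value < 0 || value > 0x7fffffff) {
    throw new RangeError("value must be a non-negative 32-bit integer");
  }
  const bytes: number[] = [];
  let remaining = value;
  // While more than 7 bits remain, emit the low 7 bits with the high bit set.
  while (remaining >= 0x80) {
    bytes.push((remaining & 0x7f) | 0x80);
    remaining >>>= 7;
  }
  // The last byte has its high bit clear, which ends the sequence.
  bytes.push(remaining);
  return bytes;
}

console.log(encode7BitInt(12));  // [12]
console.log(encode7BitInt(128)); // [128, 1]
console.log(encode7BitInt(300)); // [172, 2]
```

For 300 (binary `100101100`), the low seven bits `0101100` (44) get the continuation bit and become 172, and the remaining bits `10` become the final byte 2.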
In JavaScript, you can replicate this logic by reading one byte at a time. The lower 7 bits carry part of the length, while the highest bit indicates whether another length byte follows (1) or the string data starts with the next byte (0).
To decode the bytes into a string in a given encoding, use `TextDecoder`. Below is a TypeScript implementation that takes a buffer (`Uint8Array`), an offset into that buffer, and an optional encoding (UTF-8 by default):

    // Implementation of BinaryReader class in TypeScript
    class BinaryReader {
      getString(buffer: Uint8Array, offset: number, encoding: string = "utf-8"): string {
        let length = 0; // Length of the string that follows the prefix
        let cursor = 0; // Number of length-prefix bytes consumed so far
        let nextByte: number;
        do {
          // Retrieve the next byte of the length prefix
          nextByte = buffer[offset + cursor];
          // Take the lower 7 bits and shift them into position:
          // no shift for the first byte, then multiples of 7 for each subsequent byte.
          // Combine them with the length using bitwise OR.
          length |= (nextByte & 0x7f) << (cursor * 7);
          cursor++;
        } while (nextByte >= 0x80); // Continue while the most significant bit is 1
        // Take a view of the computed length (subarray does not copy the bytes)
        const sliceWithString = buffer.subarray(offset + cursor, offset + cursor + length);
        const decoder = new TextDecoder(encoding);
        return decoder.decode(sliceWithString);
      }
    }

If this code will be used in production, add sanity checks: cap the number of bytes read while decoding the length, and verify that the decoded length stays within the buffer boundaries.
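The checks mentioned above could look roughly like the following sketch. The standalone function `readLengthPrefixedString` and its return shape are hypothetical choices for illustration. The 5-byte cap is an assumption based on the fact that a 32-bit value never needs more than five 7-bit groups:

```typescript
// Hypothetical defensive variant of the 7-bit length-prefixed string reader.
function readLengthPrefixedString(
  buffer: Uint8Array,
  offset: number,
  encoding: string = "utf-8"
): { value: string; bytesRead: number } {
  if (!Number.isInteger(offset) || offset < 0) {
    throw new RangeError("offset must be a non-negative integer");
  }
  let length = 0;
  let cursor = 0;
  let nextByte: number;
  do {
    // A 32-bit length never needs more than 5 prefix bytes.
    if (cursor >= 5) {
      throw new RangeError("Length prefix is longer than 5 bytes");
    }
    // Do not read past the end of the buffer while decoding the prefix.
    if (offset + cursor >= buffer.length) {
      throw new RangeError("Buffer ended inside the length prefix");
    }
    nextByte = buffer[offset + cursor];
    length |= (nextByte & 0x7f) << (cursor * 7);
    cursor++;
  } while (nextByte >= 0x80);

  const start = offset + cursor;
  // Reject negative lengths (sign bit set) and lengths past the buffer end.
  if (length < 0 || start + length > buffer.length) {
    throw new RangeError("String length exceeds the buffer bounds");
  }
  const value = new TextDecoder(encoding).decode(buffer.subarray(start, start + length));
  return { value, bytesRead: cursor + length };
}

// Usage: one length byte (4) followed by the bytes of "TEST".
const data = new Uint8Array([4, 84, 69, 83, 84]);
const result = readLengthPrefixedString(data, 0);
console.log(result.value, result.bytesRead); // TEST 5
```

Returning `bytesRead` alongside the string is a design choice that lets the caller advance its offset to the next field without recomputing the prefix size.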
A brief test using the binary representation of the string "TEST STRING!" as written by `BinaryWriter.Write(string)` in C#:

    // Test example: 12 is the length prefix, followed by the 12 bytes of "TEST STRING!"
    const buffer = new Uint8Array([12, 84, 69, 83, 84, 32, 83, 84, 82, 73, 78, 71, 33]);
    const reader = new BinaryReader();
    console.log(reader.getString(buffer, 0, "utf-8"));
    // Output should be "TEST STRING!"

Update: Your comment mentioned that your data represents the string length using 4 bytes (e.g., [0, 0, 0, 29] for a length of 29). In that case, the data wasn't written using `BinaryWriter`, so reading it the way `BinaryReader` does will not work. However, a solution for handling that layout is provided below:

    // Updated implementation for handling 4-byte length representation
    class BinaryReader {
      getString(buffer: Uint8Array, offset: number, encoding: string = "utf-8"): string {
        // Create a view over the 4 length bytes starting at the given offset.
        // buffer.byteOffset is added so this also works when `buffer` is itself
        // a view into a larger ArrayBuffer (e.g. the result of subarray()).
        const view = new DataView(buffer.buffer, buffer.byteOffset + offset, 4);
        // Read these 4 bytes as a signed int32 (big-endian is the DataView default)
        const length = view.getInt32(0);
        // Take a view of the obtained length (subarray does not copy the bytes)
        const sliceWithString = buffer.subarray(offset + 4, offset + 4 + length);
        const decoder = new TextDecoder(encoding);
        return decoder.decode(sliceWithString);
      }
    }
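As a quick check of the 4-byte layout, here is a self-contained sketch with made-up sample bytes. The function name `readInt32PrefixedString` is hypothetical, and the big-endian byte order is an assumption taken from the [0, 0, 0, 29] example. If the data turns out to be little-endian, `getInt32(0, true)` would be the call to use instead:

```typescript
// Hypothetical standalone reader for a 4-byte big-endian length prefix.
function readInt32PrefixedString(
  buffer: Uint8Array,
  offset: number,
  encoding: string = "utf-8"
): string {
  if (offset < 0 || offset + 4 > buffer.length) {
    throw new RangeError("Not enough bytes for the 4-byte length prefix");
  }
  // Account for byteOffset in case `buffer` is a view into a larger ArrayBuffer.
  const view = new DataView(buffer.buffer, buffer.byteOffset + offset, 4);
  const length = view.getInt32(0); // big-endian by default
  const start = offset + 4;
  if (length < 0 || start + length > buffer.length) {
    throw new RangeError("String length exceeds the buffer bounds");
  }
  return new TextDecoder(encoding).decode(buffer.subarray(start, start + length));
}

// Made-up sample: length 4 encoded as [0, 0, 0, 4], followed by "TEST".
const sample = new Uint8Array([0, 0, 0, 4, 84, 69, 83, 84]);
console.log(readInt32PrefixedString(sample, 0)); // TEST

// The same record read through a view that starts 2 bytes into a larger buffer.
const padded = new Uint8Array([255, 255, 0, 0, 0, 4, 84, 69, 83, 84]);
console.log(readInt32PrefixedString(padded.subarray(2), 0)); // TEST
```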