3
votes

In debug view:

Here is the code that produces the garbled string:

((S2CEnterCollection)objS2c).toByteString().toStringUtf8();

Output:

    ���"default(
    ���"default(
    ���"default(
    ���"default(
    ���"default(
    ����"default(
    ����"default(
    �����"default(

Here is the code that produces the correct string:

((S2CEnterCollection)objS2c).toString()

The original string was:

    cardList {
      cardId: 100001
      liked: 100
      number: 10
      finder: "default"
      rank: 1
    }
    cardList {
      cardId: 100002
      liked: 123
      number: 10
      finder: "default"
      rank: 1
    }
    cardList {
      cardId: 100003
      liked: 543
      number: 10
      finder: "default"
      rank: 1
    }
    cardList {
      cardId: 100004
      liked: 766
      number: 10
      finder: "default"
      rank: 1
    }
    cardList {
      cardId: 100005
      liked: 78
      number: 10
      finder: "default"
      rank: 1
    }
    cardList {
      cardId: 100006
      liked: 89
      number: 123
      finder: "default"
      rank: 1
    }
    cardList {
      cardId: 100007
      liked: 199
      number: 567
      finder: "default"
      rank: 1
    }
    cardList {
      cardId: 100008
      liked: 90909
      number: 232
      finder: "default"
      rank: 1
    }

So, does anyone know how it works?

3
Hi Ryan, you might try adding in the code you used that generated this. Also, what character encoding are you using? - jmort253
Hi @jmort253, I was using UTF-8, which is the default. I also tried new String(((S2CEnterCollection)objS2c).toString().getBytes(Charset.forName("utf-8"))); which worked and gave the expected result. But as you can see, that doesn't involve any Protocol Buffers serialization: it just round-trips the text that ((S2CEnterCollection)objS2c).toString() already returns. - Ryan Zhu
Sorry, my mistake: I wasn't actually using Chinese, just English, but I still got garbled data. - Ryan Zhu
Then you should edit your question title to edit that out. On Stack Overflow, nothing you post is immutable. - jmort253

3 Answers

3
votes

Protobuf data is binary, not encoded text. You cannot run it through a text encoding like UTF-8 and expect to get a meaningful string (or expect the bytes to still be valid afterwards). The only reliable way to represent serialized protobuf data as a string is to run it through a base-N encoding for some N, typically 64 (because Base64 is well supported on most platforms).
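As a rough illustration of the point above, here is a minimal sketch using only the JDK. The four-byte array is a hypothetical stand-in for whatever toByteArray() would return (the real message bytes aren't shown in the question), and the class and method names are made up for this example. It checks whether bytes survive a UTF-8 decode and re-encode, and shows the same bytes round-tripping through Base64.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class BinaryToText {
    // Decode the bytes as UTF-8, then encode the resulting String again.
    // Invalid sequences are replaced with U+FFFD during decoding, so
    // arbitrary binary data usually does not come back unchanged.
    static boolean survivesUtf8(byte[] data) {
        String decoded = new String(data, StandardCharsets.UTF_8);
        return Arrays.equals(data, decoded.getBytes(StandardCharsets.UTF_8));
    }

    // Base64 maps every possible byte sequence to plain ASCII text...
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // ...and decoding gives back exactly the original bytes.
    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for message.toByteArray(); not real data
        // from the question.
        byte[] wire = {0x08, (byte) 0xA1, (byte) 0x8D, 0x06};

        System.out.println("survives UTF-8: " + survivesUtf8(wire));
        String text = toBase64(wire);
        System.out.println("Base64 text:    " + text);
        System.out.println("round-trips:    " + Arrays.equals(wire, fromBase64(text)));
    }
}
```

With a real message you would pass message.toByteArray() to toBase64, and on the receiving side feed the decoded bytes to the generated parseFrom method.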

2
votes

That messy string is most likely exactly correct. The problem is that you're assuming it's a human-readable string, and it's not. To quote the documentation for toByteString():

Serializes the message to a ByteString and returns it. This is just a trivial wrapper around writeTo(CodedOutputStream).

https://developers.google.com/protocol-buffers/docs/reference/java/index - look for MessageLite.

It's the sort of format that you might use to transmit across a network, or something you might store in a file with millions of records. It's not meant to be human readable - it's meant to be a relatively small, machine readable representation. So it does things like use tag identifiers (small numbers) rather than field names, variable length encoding, and various other tricks to minimize size at the expense of readability.

https://developers.google.com/protocol-buffers/docs/encoding
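To make the tag and variable-length tricks a bit more concrete, here is a small sketch of the two building blocks described in the encoding docs: a tag computed as (field number << 3) | wire type, and a base-128 varint. The class name, and the field numbers 4 and 5 used in main, are assumptions for illustration, since the .proto file isn't shown. That said, the `"` and `(` on either side of `default` in the question's output are at least consistent with those two tag bytes.

```java
import java.io.ByteArrayOutputStream;

final class WireFormatSketch {
    // A field's tag is (fieldNumber << 3) | wireType.
    // Wire type 0 = varint, wire type 2 = length-delimited (strings, bytes).
    static int tag(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }

    // Base-128 varint: 7 payload bits per byte, least significant group
    // first, with the high bit set on every byte except the last.
    static byte[] varint(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    // Render bytes as space-separated uppercase hex for inspection.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // 100001 needs three bytes as a varint instead of four as a fixed int.
        System.out.println(hex(varint(100001)));   // A1 8D 06

        // Assumed field numbers: 4 as a string field, 5 as a varint field.
        System.out.println((char) tag(4, 2));      // "
        System.out.println((char) tag(5, 0));      // (
    }
}
```

Most of the resulting bytes are either control characters or invalid as UTF-8, which is why only fragments such as the string value itself show up as readable text.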

1
votes

I prefer to use Google's own com.google.protobuf.TextFormat class, which constructs a human-readable representation of the Protobuf object's contents with its "print" methods. In the example below, PayloadContent can be any Message:

PayloadContent pc = PayloadContent.newBuilder().setContent........build();
String text = TextFormat.shortDebugString(pc);

If, however, you want to see the byte format, then by all means convert the ByteString representation to Base64, but this is not much use for a human to read :)