Let's say we want to write a filter for the following SID, here represented as a byte[]:

Code:
0x01, 0x05, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05, 0x15, 0x00, 0x00, 0x00,
0xe9, 0x67, 0xbb, 0x98, 0xd6, 0xb7, 0xd7, 0xbf, 0x82, 0x05, 0x1e, 0x6c,
0x28, 0x06, 0x00, 0x00
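For experimenting, the bytes above can be written as a Java array and decoded into a String that holds one char per byte. This is only a minimal sketch: the class and method names are made up for illustration, and it assumes ISO-8859-1 decoding, which maps every byte value 0x00 to 0xff onto the char with the same code point.

```java
import java.nio.charset.StandardCharsets;

final class SidAsString {
    // The SID from this post, byte for byte. Values above 0x7f need a cast
    // because Java bytes are signed.
    static final byte[] SID = {
        0x01, 0x05, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05, 0x15, 0x00, 0x00, 0x00,
        (byte) 0xe9, 0x67, (byte) 0xbb, (byte) 0x98, (byte) 0xd6, (byte) 0xb7,
        (byte) 0xd7, (byte) 0xbf, (byte) 0x82, 0x05, 0x1e, 0x6c,
        0x28, 0x06, 0x00, 0x00
    };

    // Hypothetical helper: one char per byte, same numeric value.
    static String toCharPerByte(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String s = toCharPerByte(SID);
        System.out.println(s.length());                   // 28
        System.out.println((int) s.charAt(12) == 0xe9);   // true
    }
}
```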
A Java String representation using Unicode escaping is:

Code:
"\u0001\u0005\u0000\u0000\u0000\u0000\u0000\u0005\u0015\u0000\u0000\u0000\u00e9\u0067\u00bb\u0098\u00d6\u00b7\u00d7\u00bf\u0082\u0005\u001e\u006c\u0028\u0006\u0000\u0000"
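If you run this through LdapEncoder.filterEncode, you'll get the following (the control characters other than NUL are passed through unescaped, so they are present but do not display here):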

Code:
\00\00\00\00\00\00\00\00ég»˜Ö·×¿‚l\28\00\00
Now, ignoring the fact that this is pretty much unreadable, the question is: does this still match the binary data stored on the server? In other words, do you get a hit when you search using a filter like:

Code:
"(objectSID=\u0001\u0005\u0000\u0000\u0000\u0000\u0000\u0005\u0015\u0000\u0000\u0000\u00e9\u0067\u00bb\u0098\u00d6\u00b7\u00d7\u00bf\u0082\u0005\u001e\u006c\u0028\u0006\u0000\u0000)"
If the answer is no, we need to find out where in the escaping chain the data gets corrupted.
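One link in the chain that seems worth checking first is the conversion from String to bytes on the wire. Assuming the client encodes the filter String as UTF-8 before sending it (an assumption about the client, not something verified here), every char above 0x7f turns into two bytes, so 0xe9 would reach the server as 0xc3 0xa9. A minimal sketch with made-up class and method names:

```java
import java.nio.charset.StandardCharsets;

final class Utf8WireCheck {
    // The Unicode-escaped String from this post: 28 chars, one per SID byte.
    static final String SID_CHARS =
        "\u0001\u0005\u0000\u0000\u0000\u0000\u0000\u0005\u0015\u0000\u0000\u0000"
        + "\u00e9\u0067\u00bb\u0098\u00d6\u00b7\u00d7\u00bf\u0082\u0005\u001e\u006c"
        + "\u0028\u0006\u0000\u0000";

    // What a UTF-8 encoding client would put on the wire (assumption).
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] wire = utf8(SID_CHARS);
        // Eight of the 28 chars are above 0x7f, so each of those costs two bytes.
        System.out.println(SID_CHARS.length() + " chars -> " + wire.length + " bytes");
        // prints "28 chars -> 36 bytes"
    }
}
```

If the byte count on the wire differs from the length of the original byte[], the value can no longer equal the stored SID.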

If the answer is yes, then we might benefit, mainly from a readability standpoint, from escaping more control characters than just NUL. Here is the same output where I've escaped all characters up to 0x1f:

Code:
\01\05\00\00\00\00\00\05\15\00\00\00ég»˜Ö·×¿‚\05\1el\28\06\00\00
It's slightly more readable, but still impossible to correlate to the original byte[]. So we probably need:

  1. a Filter that takes a byte[] and constructs a Unicode-escaped String
  2. a way of printing a Filter containing binary data in a readable way
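
One way to cover both points might be to sidestep Unicode escaping altogether and write every byte as a backslash followed by two hex digits, a form that RFC 4515 permits for any octet in a filter value. Note that this is a swapped-in technique rather than the Unicode-escaped String from point 1, and that the class and method names below are hypothetical, not part of Spring LDAP. A minimal sketch:

```java
final class BinaryFilterSketch {
    // The SID from this post, byte for byte.
    static final byte[] SID = {
        0x01, 0x05, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05, 0x15, 0x00, 0x00, 0x00,
        (byte) 0xe9, 0x67, (byte) 0xbb, (byte) 0x98, (byte) 0xd6, (byte) 0xb7,
        (byte) 0xd7, (byte) 0xbf, (byte) 0x82, 0x05, 0x1e, 0x6c,
        0x28, 0x06, 0x00, 0x00
    };

    // Hypothetical helper: escape every byte as a backslash plus two
    // lowercase hex digits, regardless of whether it is printable.
    static String escapeAllBytes(byte[] value) {
        StringBuilder sb = new StringBuilder(value.length * 3);
        for (byte b : value) {
            int v = b & 0xff; // treat the signed byte as 0..255
            sb.append('\\')
              .append(Character.forDigit(v >> 4, 16))
              .append(Character.forDigit(v & 0x0f, 16));
        }
        return sb.toString();
    }

    // Hypothetical helper: build an equality filter around the escaped value.
    static String equalsFilter(String attribute, byte[] value) {
        return "(" + attribute + "=" + escapeAllBytes(value) + ")";
    }

    public static void main(String[] args) {
        System.out.println(equalsFilter("objectSID", SID));
    }
}
```

The resulting filter is pure ASCII, so it reads the same however it is encoded or printed, and each escaped pair lines up with one entry of the original byte[]. A string built this way should presumably not be run through LdapEncoder.filterEncode again, since the backslashes would then be escaped themselves.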