OfficeOpenXml: Agile Encryption


Recently danyelljc contacted me to point out that the code associated with an earlier article did not include an example of using “Agile Encryption” (see sections 2.3.4.10 – 2.3.4.15 in MS-OFFCRYPTO), which is the default encryption in Office 2010. This post corrects my omission.

Update 2011-05-29: The example decryption code now includes routines which implement password verification and data integrity tests.

Update 2011-06-07: Included an option to compute the SHA1 hash by calling the crypto api directly instead of using the managed code implementation of the algorithm. It’s about 2x faster. Also included public properties to make testing the password and data integrity optional. In the earlier versions about five-sevenths of the time is spent in password verification and data integrity testing.

Update 2012-04-04: @Webie has created a C implementation of the password validation routines for use on Linux using OpenSSL and libgsf (to read OLE Storage files). At this time, support is provided for 2007 and 2010 Agile encryption. You will find his implementation here: https://github.com/magnumripper/magnum-jumbo

Back in 2008 I wrote an article about how to decrypt Office 2007 files and especially to provide some conformance data. There is no conformance data in MS-OFFCRYPTO (the reference document which describes the algorithms to encrypt and decrypt Office documents of all types). The lack of this data makes it difficult for developers to develop code to create their own encryption routines.

Implementing Agile decryption is straightforward. Here’s a class which provides an example implementation. Generally the process of Agile decryption is the same as the “standard” encryption/decryption process (which the earlier post covered) but with some significant differences:

  • The “EncryptionInfo” which provides key decryption information is an Xml document rather than binary data
  • The package itself is encrypted in 4K blocks.
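To make the second point concrete, here is a minimal Python sketch (Python rather than the C# of the example class, so it runs without any of our libraries). It rests on a reading of MS-OFFCRYPTO 2.3.4.15 that should be checked against the spec: the EncryptedPackage stream is assumed to start with an 8-byte little-endian plaintext length, and each 4096-byte segment is assumed to use an IV made from the hash of the keyData salt followed by the 4-byte little-endian segment index, truncated to the cipher block size. The names `split_package` and `segment_iv` are hypothetical, SHA1 and a 16-byte AES block are assumed, and the AES-CBC step itself is left out.

```python
import hashlib
import struct

SEGMENT_SIZE = 4096  # the "4K blocks" the package is encrypted in


def split_package(stream: bytes):
    """Split an EncryptedPackage stream into its declared size and 4K segments."""
    # Assumption: the first 8 bytes hold the plaintext length (little-endian).
    (declared_size,) = struct.unpack_from("<Q", stream, 0)
    body = stream[8:]
    segments = [body[i:i + SEGMENT_SIZE] for i in range(0, len(body), SEGMENT_SIZE)]
    return declared_size, segments


def segment_iv(key_data_salt: bytes, index: int, block_size: int = 16) -> bytes:
    """IV for segment `index`: SHA1(salt + LE32(index)) cut to the block size."""
    digest = hashlib.sha1(key_data_salt + struct.pack("<I", index)).digest()
    return digest[:block_size]


if __name__ == "__main__":
    # Made-up stream: 5000 plaintext bytes padded to 5008 span two segments.
    fake_stream = struct.pack("<Q", 5000) + bytes(5008)
    size, segments = split_package(fake_stream)
    for index, segment in enumerate(segments):
        # A placeholder all-zero salt stands in for the real keyData salt.
        print(index, len(segment), segment_iv(bytes(16), index).hex())
```

Each segment would then be decrypted on its own with the package key and its IV, and the result trimmed to the declared size.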

Warning: The code includes an entry point method (DecryptToArray()) which will not work for you. It relies on one of our libraries to open and access the file type used by Office to store encrypted documents. However, it shows the steps you need to take to extract the EncryptionInfo and EncryptedPackage stream content so you are in a position to begin the process of decryption.
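Once the EncryptionInfo stream content has been extracted by whatever OLE storage reader you have, the Xml can be read with ordinary tools. The following Python sketch is an illustration only, not the code from the example class. It assumes the stream carries 8 header bytes (version 4.4 and a reserved value) before the Xml, and it assumes the two namespace URIs shown; both assumptions come from a reading of MS-OFFCRYPTO and should be verified. `parse_encryption_info` and `sample_stream` are hypothetical names, and the sample stream is hand-made from placeholder zero bytes.

```python
import base64
import struct
import xml.etree.ElementTree as ET

NS = {
    "e": "http://schemas.microsoft.com/office/2006/encryption",
    "p": "http://schemas.microsoft.com/office/2006/keyEncryptor/password",
}


def parse_encryption_info(stream: bytes) -> dict:
    """Pull the password key-encryptor fields out of an Agile EncryptionInfo stream."""
    # Assumption: 8 header bytes (version 4.4 plus a reserved value) precede the Xml.
    major, minor, _reserved = struct.unpack_from("<HHI", stream, 0)
    if (major, minor) != (4, 4):
        raise ValueError("not an Agile EncryptionInfo stream")
    root = ET.fromstring(stream[8:])
    key = root.find("e:keyEncryptors/e:keyEncryptor/p:encryptedKey", NS)
    if key is None:
        raise ValueError("no password key encryptor found")
    return {
        "spin_count": int(key.get("spinCount")),
        "salt_size": int(key.get("saltSize")),
        "key_bits": int(key.get("keyBits")),
        "hash_algorithm": key.get("hashAlgorithm"),
        # The binary attributes are Base64 text in the Xml.
        "salt": base64.b64decode(key.get("saltValue")),
        "encrypted_verifier_hash_input": base64.b64decode(key.get("encryptedVerifierHashInput")),
        "encrypted_verifier_hash_value": base64.b64decode(key.get("encryptedVerifierHashValue")),
        "encrypted_key_value": base64.b64decode(key.get("encryptedKeyValue")),
    }


def sample_stream() -> bytes:
    """Hand-made stand-in for a real stream; every value here is a placeholder."""
    blob16 = base64.b64encode(bytes(16)).decode("ascii")
    blob32 = base64.b64encode(bytes(32)).decode("ascii")
    xml = (
        '<encryption xmlns="{e}" xmlns:p="{p}">'
        '<keyEncryptors><keyEncryptor uri="{p}">'
        '<p:encryptedKey spinCount="100000" saltSize="16" keyBits="128" '
        'hashAlgorithm="SHA1" saltValue="{b16}" '
        'encryptedVerifierHashInput="{b16}" '
        'encryptedVerifierHashValue="{b32}" encryptedKeyValue="{b16}"/>'
        '</keyEncryptor></keyEncryptors></encryption>'
    ).format(e=NS["e"], p=NS["p"], b16=blob16, b32=blob32)
    return struct.pack("<HHI", 4, 4, 0x40) + xml.encode("utf-8")


if __name__ == "__main__":
    info = parse_encryption_info(sample_stream())
    print(info["spin_count"], info["key_bits"], len(info["salt"]))
```

A real stream also contains keyData and dataIntegrity elements, which are needed for package decryption and the integrity test; they can be read the same way.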


Reader Comments

Nice work on this one – I’ve used it to do password verification. The hash operation is horribly slow though due to the spin count of 100000 (default for 2010 docs). I used a high res timer on the spin loop in GenerateAgileEncryptionKey and got 390ms. By putting the .ComputeHash call directly into this method, I managed to shave off 70ms bringing the time down to 320ms. What’s more is that GenerateAgileEncryptionKey is called twice for a single password verification and that time coupled with the overhead of the other ops (decrypt and one-time hashing) brings the time up to about 800ms/core on my machine.

When I started experimenting with this using your initial project, I thought I’d get substantially better performance on that type of implementation as opposed to using the Excel API. However, using that API I get around 10 attempts/sec/core – 12x faster than the .net implementation! I expect a good chunk of the time is going to marshaling the data but to get decent performance you’d need to write an unmanaged implementation of at least the GenerateAgileEncryptionKey method.

According to http://blogs.msdn.com/b/david_leblanc/archive/2008/12/04/new-improved-office-crypto.aspx, Elcomsoft achieved 5000 iterations/sec on the Standard version. If memory serves, that version only used one “spun” hash operation with a spin count of 50000. That would bring the relative number down to 1500/sec which still seems very fast by comparison.

Thanks for the feedback. Fortunately password verification is not needed to decrypt a document but then if you need password verification…

It’s interesting that you were able to knock 70ms off just by moving the ComputeHash call. Performance is not a consideration in our environment so no thought was given to it. Did you try replacing the .Concat() call with explicit byte array manipulation? The Linq call may be inefficient because it is likely to iterate over the underlying array rather than efficiently copy it.

Yes, replacing the .NET implementation of the SHA1 hash with a direct call to CryptoAPI functions is likely to improve things. Using the CryptoAPI directly is awkward but not difficult to do. If you’d like an example of replacing the .NET hash function with the CryptoAPI one, let me know.

I put timers around each line in the inner loop. The .ComputeHash call accounts for 90% of the time so array manipulation is not the issue.

I would definitely like to try out an unmanaged hashing implementation. Seems wrong that the Excel API is faster than semi-optimised .net code!

I’ve updated the code to optionally use a managed SHA1 hash or an SHA1 implementation which uses the crypto api directly. Use the link in the article to access the new version. There is a significant improvement though not nearly as much as I’d have anticipated. Here are some stats from tests run (using the Diagnostics.Stopwatch) on my laptop.

Using managed code:

Verifying the password 10x: 2330 ms
Verifying, checking integrity and decrypting: 2690 – 3540 ms
Decrypting only: 903 – 1090 ms

Using the crypto api directly:

Verifying the password 10x: 1200 ms
Verifying, checking integrity and decrypting: 1820 – 2140 ms
Decrypting only: 526 – 695 ms

So roughly twice as fast using the crypto api but not 12x.

Given all this is array manipulation it’s hard to see why it’s SO much slower than Excel. Must be marshalling the byte arrays as you suggested. The acid test would be a C++ implementation. Definitely for another day.

As you know I was concerned about the use of .Concat() so I’ve tried using a static hash buffer in the GenerateAgileEncryptionKey method instead but it makes only about 10ms improvement per password verification (~100ns/iteration of the spin count).
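For readers who want to repeat this kind of comparison, here is a rough Python analogue of the two variants. It is not the C# from the article: `spin_concat` builds a new input for every iteration (the counterpart of the .Concat() call) and `spin_buffer` reuses one preallocated buffer (the counterpart of the static hash buffer). Both names are hypothetical, SHA1 is assumed, and absolute timings in Python will not match the .NET figures quoted in this thread.

```python
import hashlib
import struct
import time


def spin_concat(h0: bytes, spin_count: int) -> bytes:
    """H(n) = SHA1(LE32(n) + H(n-1)), allocating a new input each iteration."""
    h = h0
    for i in range(spin_count):
        h = hashlib.sha1(struct.pack("<I", i) + h).digest()
    return h


def spin_buffer(h0: bytes, spin_count: int) -> bytes:
    """Same loop, but the 4-byte iterator and the hash share one reused buffer."""
    buf = bytearray(4 + len(h0))
    buf[4:] = h0
    for i in range(spin_count):
        struct.pack_into("<I", buf, 0, i)     # overwrite the iterator in place
        buf[4:] = hashlib.sha1(buf).digest()  # overwrite the previous hash in place
    return bytes(buf[4:])


if __name__ == "__main__":
    seed = hashlib.sha1(b"placeholder salt and password").digest()
    for spin in (spin_concat, spin_buffer):
        start = time.perf_counter()
        result = spin(seed, 100000)
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(f"{spin.__name__}: {elapsed_ms:.0f} ms, ends {result.hex()[-8:]}")
```

The two functions return the same digest, so any difference in the printed times comes from buffer handling rather than from the hash itself.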

Thanks for this, I’m guessing you had this prebuilt and on-hand! 2x is a great improvement but as you say, doesn’t seem to be close to Excel’s speed which really is odd. It just doesn’t lose that much time to array manipulation so I’m not sure why it’s still 5x slower. Will work on this code and see what I get for performance.

Which license is this code? The header says it is Apache 2.0, but then below there is a blurb about LGPL 2.0, with a link pointing to CreativeCommons ShareAlike.

Hi Bills, I am writing a program for Agile Encryption
but I do not understand where the problem is.
(1) Here the salt value is longer than 16 bytes. If I use the whole salt, then what is the use of the salt size?
(2) H(n)=H[iterator + H(n-1)]
(3) H(final)=H[H(n)+Block_key]
What is the block key in this step? In standard encryption this was 4 zero bytes.

The only purpose of the salt size value is to let the decrypter know how big an array to create to hold the salt value.

In the method DecryptAgile() the salt size value is not used. But it should be used to validate the size of the salt value itself.

It is conceivable that the mechanism used to generate the salt value recorded in the Xml stream emits superfluous bytes, in which case the salt size value might be used to indicate how many of those bytes to use, or to indicate an error.

However I know my encryption algorithm does not have such side effects and I know the Office algorithms do not have such side effects.
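The validation suggested above might look like the following Python sketch. It assumes the saltValue attribute is Base64 text, as the Xml schema's binary attributes are, so the comparison is made against the decoded bytes and not the attribute's character count. `decode_salt` is a hypothetical helper and the salt is a placeholder.

```python
import base64


def decode_salt(salt_value: str, salt_size: int) -> bytes:
    """Decode the Base64 saltValue attribute and check it against saltSize."""
    salt = base64.b64decode(salt_value)
    if len(salt) != salt_size:
        raise ValueError(
            f"saltSize says {salt_size} bytes but saltValue holds {len(salt)}"
        )
    return salt


if __name__ == "__main__":
    # Placeholder salt: 16 raw bytes, stored in the Xml as Base64 text.
    text = base64.b64encode(bytes(range(16))).decode("ascii")
    print(len(text), "characters of text,", len(decode_salt(text, 16)), "bytes of salt")
```

A 16-byte salt occupies 24 characters once Base64-encoded, which is one way a salt can appear to be longer than its declared size.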

Hi Bills, I have followed all your steps but I do not understand where the problem is.

I am using H(zero)=H(Salt + Password)=H(initialHashInput)
H(zero)={0x77,0xad,0x3d,0x1b,0xbb,0xf7,0x28,0xdd,0xdc,0xcf,0xe1,0x82,0xbb,0xcc,0x91,0x8a,0xff,0x6a,0xa2,0x9f}// plz verify H(zero)result
Step 2-> H(n)=H(iterator + H(n-1)) // SpinCount =100000 //
H(n)={0x33,0xd9,0x17,0x8c,0x66,0x99,0x1c,0xe8,0x16,0x9d,0x59,0x00,
0xb7,0xd5,0x72,0xdf,0x9c,0x1c,0x08,0xa0} // plz verify H(n)result
static byte GHashInput2010[20] = H(n);
Step 3 -> H(final)=H[H(n)+ block_key]
What is block_key in this step? I do not understand.
In standard encryption this was 4 zero bytes.

memcpy(H(n) + DIGEST_SHA1, m_verifierHashInputblockKey2010, 8);
// DIGEST_SHA1=20 /////
updateSHA1( aDigest, H(n), DIGEST_SHA1+ 8);
getSHA1( aDigest, H(n), DIGEST_SHA1);
Now H(n)={0xbd,0x5d,0x97,0x66,0x72,0x82,0x84,0x78,0x9c,0xab,0xbe,0x33,0xf4,0x6f,0x30,0x82,0x2a,0x2b,0x0e,0xde,0xfe,0xa7,0xd2,0x76,0x3b,0x4b,0x9e,0x79}// plz verify result
DeriveKey(H(n), DIGEST_SHA1, &m_maKey.front(), m_KeySize/8 );
AES_KEY key;
sal_uInt8 pnVerifier[24] = { 0 };
sal_uInt8 pnVerifierHash[44] = { 0 };
// pnVerifier[16] = { 0 };//for standard encryption
// pnVerifierHash[32] = { 0 };// for standard encryption
AES_decrypt_key(&m_maKey.front(), m_KeySize, &key);
AES_ecb_encrypt(m_verifierHashInput2010, pnVerifier, &key, AES_DECRYPT);
Now pnVerifier={0x58,0x8a,0xf1,0x89,0xc6,0xbc,0xe8,0x69,0x3a,0x21,0x16,0x77,0x6c,0xd8,0xcd,0x1d}// plz verify result
memcpy(GHashInput2010+ DIGEST_SHA1, m_verifierHashValueblockKey2010, 8);
H(GHashInput2010)={0x71,0xfa,0x3f,0xe5,0x4a,0x1e,0x38,0xa9,0x68,0x96,
0x72,0xcc,0xfa,0x77,0x90,0xf5,0x7b,0xe1,0x70,0x1d,0xd7,0xaa,0x0f,0x6d,0x30,
0x61,0x34,0x4e}// plz verify the result
lclDeriveKey(H(GHashInput2010), DIGEST_SHA1, &m_maKey.front(), m_KeySize/8 );
AES_set_decrypt_key(&m_maKey.front(), m_KeySize, &key);
AES_ecb_encrypt(m_verifierHashValue2010, pnVerifierHash, &key, AES_DECRYPT);
Now pnVerifierHash={0x2d,0xcc,0x9e,0x5f,0xe7,0xd5,0x83,0xc4,0x9f,0x8d,
0xe8,0xa4,0x74,0xf2,0xa5,0x62}// plz verify result
pnSha1Hash=H(pnVerifier);
if (!memcmp( pnSha1Hash, pnVerifierHash, 16 )) {
return pPwd;
}
return NULL;

In Office 2007 I can successfully verify the password, but in Office 2010 I do not understand which step of the key generation is wrong.

Thanks.

keyEncryptor
saltValue="2bJlhBIvCzBvQkxV2DelkA=="
encryptedVerifierHashInput="NiJkrrpC4e4RfdyYSQ9PRQ=="
encryptedVerifierHashValue="6lZhx9DxTMRDClA1lx27fGywU6kEZUGhabGeH5cNdVI="
encryptedKeyValue="YpxqgXOBZY0QENQ18uzaig=="
I am using <p:encrypted saltValue="2bJlhBIvCzBvQkxV2DelkA==" for the salt.
I wrote the code according to [MS-OFFCRYPTO].pdf sections 2.3.4.10 – 2.3.4.15,
but I do not understand where the problem is.
pPwd=a={0x61,0x00};
int nLen = (int)(wcslen(pPwd) * sizeof(wchar_t));
sal_uInt32 nBufferSize = 24 + nLen;
static byte m_salt2010[24]={0x32,0x62,0x4A,0x6C,0x68,0x42,0x49,0x76,0x43,0x7A,0x42,0x76,0x51,0x6B,0x78,0x56,0x32,0x44,0x65,0x6C,0x6B,0x41,0x3D,0x3D};
static byte m_verifierHashInput2010[24]={0x4E,0x69,0x4A,0x6B,0x72,0x72,0x70,0x43,0x34,0x65,0x34,0x52,0x66,0x64,0x79,0x59,0x53,0x51,0x39,0x50,0x52,0x51,0x3D,0x3D};
static byte m_verifierHashValue2010[44]={0x36,0x6C,0x5A,0x68,0x78,0x39,0x44,0x78,0x54,0x4D,0x52,0x44,0x43,0x6C,0x41,0x31,0x6C,0x78,0x32,0x37,0x66,0x47,0x79,0x77,0x55,0x36,0x6B,0x45,0x5A,0x55,0x47,0x68,0x61,0x62,0x47,0x65,0x48,0x35,0x63,0x4E,0x64,0x56,0x49,0x3D};
static byte m_EncryptedKeyValue2010[24]={0x59,0x70,0x78,0x71,0x67,0x58,0x4F,0x42,0x5A,0x59,0x30,0x51,0x45,0x4E,0x51,0x31,0x38,0x75,0x7A,0x61,0x69,0x67,0x3D,0x3D};
static byte m_verifierHashInputblockKey2010[8]={0xfe,0xa7,0xd2,0x76,0x3b,0x4b,0x9e,0x79};
static byte m_verifierHashValueblockKey2010[8]={0xd7,0xaa,0x0f,0x6d,0x30,0x61,0x34,0x4e};
byte initialHashInput[128];
memcpy(initialHashInput, m_salt2010, 24);
memcpy(&initialHashInput[24], pPwd, nLen);

@anu

I’ll try to provide a set of conformance byte arrays for a 2010 workbook.

You ask about step 3:
H(final)=H[H(n)+ block_key]
What is block_key in this step?

block_key=H(0)

H(0) depends on what action you are performing. It could be:

m_verifierHashInputblockKey2010
m_verifierHashValueblockKey2010
m_encryptedKeyValueBlockKey
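As a sketch of how the three block keys fit into the steps being discussed, here is a Python version of the key generation as read from MS-OFFCRYPTO 2.3.4.11 and 2.3.4.13. It is an illustration under stated assumptions, not the GenerateAgileEncryptionKey method from the example class. The first two block key values are the ones quoted in this thread; the third is given as recalled from the spec and should be checked there. SHA1 is assumed, the function names are hypothetical, and the final key is assumed to be the hash truncated to keyBits (padded with 0x36 bytes if the hash is too short), with no further DeriveKey step as in the Standard scheme.

```python
import hashlib
import struct

BLOCK_KEY_VERIFIER_HASH_INPUT = bytes([0xFE, 0xA7, 0xD2, 0x76, 0x3B, 0x4B, 0x9E, 0x79])
BLOCK_KEY_VERIFIER_HASH_VALUE = bytes([0xD7, 0xAA, 0x0F, 0x6D, 0x30, 0x61, 0x34, 0x4E])
# Assumption: value recalled from MS-OFFCRYPTO, not quoted in this thread.
BLOCK_KEY_ENCRYPTED_KEY_VALUE = bytes([0x14, 0x6E, 0x0B, 0xE7, 0xAB, 0xAC, 0xD0, 0xD6])


def iterated_hash(password: str, salt: bytes, spin_count: int) -> bytes:
    """Steps 1 and 2: H(zero) = H(salt + password), H(n) = H(LE32(n) + H(n-1))."""
    h = hashlib.sha1(salt + password.encode("utf-16-le")).digest()
    for i in range(spin_count):
        h = hashlib.sha1(struct.pack("<I", i) + h).digest()
    return h


def derive_key(h_n: bytes, block_key: bytes, key_bits: int) -> bytes:
    """Step 3: H(final) = H(H(n) + block_key), sized to key_bits."""
    h_final = hashlib.sha1(h_n + block_key).digest()
    key_len = key_bits // 8
    # Truncate, or pad with 0x36 when the hash is shorter than the key.
    return h_final[:key_len].ljust(key_len, b"\x36")


if __name__ == "__main__":
    # Placeholder salt; the spin loop runs once and is shared by all three keys.
    h_n = iterated_hash("a", bytes(16), 100000)
    for name, block_key in [
        ("verifier hash input", BLOCK_KEY_VERIFIER_HASH_INPUT),
        ("verifier hash value", BLOCK_KEY_VERIFIER_HASH_VALUE),
        ("encrypted key value", BLOCK_KEY_ENCRYPTED_KEY_VALUE),
    ]:
        print(name, derive_key(h_n, block_key, 128).hex())
```

Each derived key decrypts the matching attribute. As far as the spec reads, that decryption uses the chaining mode named in the Xml (CBC) with the salt as the IV, so an ECB call would be expected to give different bytes.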

Hi Bills,
if block_key=H(0) and it could be:
m_verifierHashInputblockKey2010
m_verifierHashValueblockKey2010
m_encryptedKeyValueBlockKey
then where am I going wrong? I am using m_verifierHashInputblockKey2010 for block_key. I have already posted all my steps on February 21.
In my protected file:
encryptedKey saltValue="2bJlhBIvCzBvQkxV2DelkA=="
m_salt2010[24]={0x32,0x62,0x4A,0x6C,0x68,0x42,0x49,0x76,0x43,0x7A,0x42,0x76,0x51,0x6B,0x78,0x56,0x32,0x44,0x65,0x6C,0x6B,0x41,0x3D,0x3D}
encryptedVerifierHashInput="NiJkrrpC4e4RfdyYSQ9PRQ=="
m_verifierHashInput2010[24]={0x4E,0x69,0x4A,0x6B,0x72,0x72,0x70,0x43,0x34,0x65,0x34,0x52,0x66,0x64,0x79,0x59,0x53,0x51,0x39,0x50,0x52,0x51,0x3D,0x3D}
encryptedVerifierHashValue="6lZhx9DxTMRDClA1lx27fGywU6kEZUGhabGeH5cNdVI="
m_verifierHashValue2010[44]={0x36,0x6C,0x5A,0x68,0x78,0x39,0x44,0x78,0x54,0x4D,0x52,0x44,0x43,0x6C,0x41,0x31,0x6C,0x78,0x32,0x37,0x66,0x47,0x79,0x77,0x55,0x36,0x6B,0x45,0x5A,0x55,0x47,0x68,0x61,0x62,0x47,0x65,0x48,0x35,0x63,0x4E,0x64,0x56,0x49,0x3D}
encryptedKeyValue="YpxqgXOBZY0QENQ18uzaig=="
m_EncryptedKeyValue2010[24]={0x59,0x70,0x78,0x71,0x67,0x58,0x4F,0x42,0x5A,0x59,0x30,0x51,0x45,0x4E,0x51,0x31,0x38,0x75,0x7A,0x61,0x69,0x67,0x3D,0x3D}
Password on File is pPwd=a={0x61,0x00};
static byte m_verifierHashInputblockKey2010[8]={0xfe,0xa7,0xd2,0x76,0x3b,0x4b,0x9e,0x79};
static byte m_verifierHashValueblockKey2010[8]={0xd7,0xaa,0x0f,0x6d,0x30,0x61,0x34,0x4e};
After Step 1 -> H(zero)=H(Salt + Password)=H(initialHashInput)
my result is
H(zero)={0x77,0xad,0x3d,0x1b,0xbb,0xf7,0x28,0xdd,0xdc,0xcf,0xe1,0x82,0xbb,0xcc,0x91,0x8a,0xff,0x6a,0xa2,0x9f}
After Step 2-> H(n)=H(iterator + H(n-1)) // SpinCount =100000 //
Result is H(n)={0x33,0xd9,0x17,0x8c,0x66,0x99,0x1c,0xe8,0x16,0x9d,0x59,0x00,
0xb7,0xd5,0x72,0xdf,0x9c,0x1c,0x08,0xa0}
After Step 3 -> H(final)=H[H(n)+ block_key]
Here I am using block_key m_verifierHashInputblockKey2010
Result is H(n)={0xbd,0x5d,0x97,0x66,0x72,0x82,0x84,0x78,0x9c,0xab,0xbe,0x33,0xf4,0x6f,0x30,0x82,0x2a,0x2b,0x0e,0xde,0xfe,0xa7,0xd2,0x76,0x3b,0x4b,0x9e,0x79}
and next steps are
DeriveKey(H(n), DIGEST_SHA1, &m_maKey.front(), m_KeySize/8 );
AES_KEY key;
sal_uInt8 pnVerifier[24] = { 0 };
sal_uInt8 pnVerifierHash[44] = { 0 };
AES_decrypt_key(&m_maKey.front(), m_KeySize, &key);
AES_ecb_encrypt(m_verifierHashInput2010, pnVerifier, &key, AES_DECRYPT);
Now Result is
pnVerifier={0x58,0x8a,0xf1,0x89,0xc6,0xbc,0xe8,0x69,0x3a,0x21,0x16,0x77,0x6c,0xd8,0xcd,0x1d};
memcpy(GHashInput2010+ DIGEST_SHA1, m_verifierHashValueblockKey2010, 8);
Now Result is
H(GHashInput2010)={0x71,0xfa,0x3f,0xe5,0x4a,0x1e,0x38,0xa9,0x68,0x96,
0x72,0xcc,0xfa,0x77,0x90,0xf5,0x7b,0xe1,0x70,0x1d,0xd7,0xaa,0x0f,0x6d,0x30,
0x61,0x34,0x4e};
lclDeriveKey(H(GHashInput2010), DIGEST_SHA1, &m_maKey.front(), m_KeySize/8 );
AES_set_decrypt_key(&m_maKey.front(), m_KeySize, &key);
AES_ecb_encrypt(m_verifierHashValue2010, pnVerifierHash, &key, AES_DECRYPT);
Now pnVerifierHash={0x2d,0xcc,0x9e,0x5f,0xe7,0xd5,0x83,0xc4,0x9f,0x8d,
0xe8,0xa4,0x74,0xf2,0xa5,0x62};
pnSha1Hash=H(pnVerifier);
if (!memcmp( pnSha1Hash, pnVerifierHash, 16 )) {
return pPwd;
}
return NULL;
I followed all your steps but the hash does not match, and I do not understand where the problem is.
Thanks