Jeromy Anglim's Blog: Psychology and Statistics


Thursday, September 24, 2009

Recovering a Corrupted Excel 2007 File | XLSX and XLSM Format

An Excel 2007 file that I was working on recently became corrupted. The following are some things that I learnt in my process of recovering the file.

  • I tried various recovery options that did not work: Recovery Toolbox, clicking the open and repair option in Excel; searching for auto save files in User - Application Data - Microsoft - Excel; I tried changing the file extension to ".xls", ".xlsm", "xlsx". Basically I kept getting the error message "Excel cannot open the file ... because the file format or the file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file."
  • Excel 2007 files (e.g., ".xlsx", ".xlsm") are in an XML format. If you change the file extension (i.e., "xlsx", "xlsm") to ".zip", you can unzip the Excel file and look at its components. JKP provides a helpful overview of the file format. 
  • The "xl" folder in the extracted zip file seemed to contain most of what I needed, although it was not arranged in a particularly accessible form. In particular, "sharedStrings.xml" had most of the cell content but in a disorganised form. "workbook.xml" had the workbook names and "tables\...xml" had the column names for my tables.
  • If you drag and drop the "xml" files into Excel into the xl folder, you can open them as a table and have a look.
Thus, with a little fiddling I was able to recreate my Excel file.



My file was called "meta.xlsx"; this is what it looked like after converting to "meta.zip" and extracting the zip file.

24 comments:

  1. I was in similar situation a lot of times,but every times next software helped me-repair Excel file.It is free as far as I remember and can also extract information from corrupted Microsoft Excel files.

    ReplyDelete
  2. Hi Alex,
    Yes. I tried The Recovery Toolbox. It did not work in my case. The software publishers do allow you to trial it for free.

    ReplyDelete
  3. Hello,

    Hi Jeromy,
    I have had tremendous help with the 'zip' trick. Only one question left. I have been able to recover my excelsheet, except for the vbcode.
    I have found the vb code in the zip file as "vbaProject.bin". In what way can I retrieve the information of this "vbaProject.bin" file?

    Emmo

    ReplyDelete
  4. Hi Emmo,
    I'm not sure. I am guessing that you have already tried to open the file in a text editor like Notepad?
    cheers,
    Jeromy

    ReplyDelete
  5. Yes,
    Then I only get very strange non-readable characters.

    Emmo

    ReplyDelete
  6. Jeromy,

    found the solution. I have made a 'test' excelsheet with an ampty vb module.
    Renamed the .xlsm to .zip and acutally unzipped the file. Then I replaced the vbaProject.bin with my saved vbaProject.bin file I wanted to retrieve.
    Zipped all data, renamed back to .xlsm and .......
    YES I got my vb code back.

    Thanks for the help.

    Emmo

    ReplyDelete
  7. @Emmo Thanks for sharing your solution

    ReplyDelete
  8. i format may pc but i forget the file in excel his encrypt then i cannot open!"Excel cannot open the file ... because the file format or the file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file."im too much tired!pls help me....

    ReplyDelete
  9. my file wont even unzip. do you think you can help with this?

    ReplyDelete
  10. @Anonymous July 30 2010: several possibilities:
    (1) The file is not xlsx/xlsm;
    (2) have you changed the file extension to zip.
    Perhaps, try asking the question on superuser.com explaining in detail your situation.

    ReplyDelete
  11. Wow, I was desperately seeking a solution for my corrupted xlsm-File and I finally found a extremely successful surprisingly elegant way to resolve the problem for my seriously damaged workbook. It worked very fine for me. Thanks!

    ReplyDelete
  12. Hi Sheolina,

    How did you do that?

    ReplyDelete
  13. Hi Jeromy,

    my porblem: probably unplugged my USB drive without saving/exiting Excel. What i have now is a .xlsm file and its ~$ file. Excel complains about extension or corruption problem. RecoveryTB didnt help. unzipping file opens only several files in 2 folders + a corruption message.

    Any suggestion what to do from here?


    Thanks
    GadyC

    ReplyDelete
  14. Hi GadyC. I don't think I can help. The unzip trick assumes that when you unzip the Excel file there is sufficient uncorrupted data to restore the file. Besides that you could always see what people say on http://superuser.com/

    ReplyDelete
  15. Ran into a similar problem when going from Office 2003 to Office 2007. To fix this, edit your Registry.

    For Office 2007:
    Windows Registry Editor Version 5.00
    [HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Excel\Security]
    "AccessVBOM"=dword:00000001
    "ExtensionHardening"=dword:00000000

    For Office 2010:
    Windows Registry Editor Version 5.00
    [HKEY_CURRENT_USER\Software\Microsoft\Office\14.0\Excel\Security]
    "AccessVBOM"=dword:00000001
    "ExtensionHardening"=dword:00000000

    You can either manually make the above entries or copy and paste the text into Notepad and save it using the ".reg" extension. If you want to make a reg file make sure you leave a space between the first line, "Windows Registry Editor Version 5.00", and the next line, and place a space after the last line.

    To save your reg file, go to FILE | SAVE AS, browse to the place you want to save it and when you give it the name, enclose the name in quotes; e.g. "ExcelExtensionHardening.reg". Make sure to do all this in Notepad, not Word, WordPad, etc., as those programs will add things that will make the reg file useless, and could possiblly hose your entire OS.

    To apply your reg file, just double-click it.

    ReplyDelete
  16. Regarding Emmo's problem with the VbaProject part, I use a tool called VbaDiff. It seems this app can read code straight out of the Excel file, so it's worth a try if anyone has a similar problem.

    Dave

    ReplyDelete
  17. all great advice - anyone know how i can recover text from inserted text boxes (as they are missing when i open the file after going through the series of corrupted file messages, as in it appears that all text boxes are removed upon opening the corrupted file)...

    ReplyDelete
  18. @ Emmo
    March 15, 2010 2:31 AM

    This was genius and saved me a ton. Thank you !!!

    ReplyDelete
  19. I don't know if this will help anyone else, but I had a .xlsx file which gave the "Excel cannot open the file because the file format or extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file" error in Excel 2007 SP2. The "Open and Repair" method did not work, and I could not save the file as a .zip extensions. What did work, however, was opening the file in OpenOffice and resaving as a different filename. Good luck!

    ReplyDelete
  20. I have tried to zip the file but still getting error no archive found to unzip... can you suggest me on this
    Alex you said you have used one software to repair can you mention the name of the software so that i can download the try it

    ReplyDelete
  21. Same problem. Repair won't work. Unzipping works (kind of).

    Emmo's solution might.

    ReplyDelete
  22. I have tried to do all things mentioned above.
    Repair does not work, Opening as text provides only unreadable characters. Opening with OpenOffice provides about the same result as opening as text. Making it zip does not work.
    I have encountered this problem several times and I wonder whether I have any options left?

    ReplyDelete
  23. All methods which are mentioned in the above post are excellent but I like second method most because it can be performed by any user very easily and repair excel file is quick time.

    ReplyDelete