Monday 27 February 2023

“HACKING” OFFICE FILES

 



Introduction

You have probably used an Office product (Word, Excel or PowerPoint) at least once, but did you know that you can explore the source content of its files and see how they actually work? I’ll guide you through exploring Office files…

 

Getting started

To get started, you just need a computer with an Office document (here I’ll use a Word document because it has the simplest structure).

Now rename your file by adding “.zip” to the end of its name. This lets us explore its content, because modern Office files (.docx, .xlsx, .pptx) are, perhaps surprisingly, not in an opaque binary format: each one is “just” a Zip archive (with a little extra stuff) containing other files with all the information needed to render the document you’re writing.

If you are not able to rename the file on your own, you can follow this guide.

After renaming the file, unzip it: on Windows, right-click it, choose “Extract All” and confirm; on Mac, just double-click the file. Then open the folder that was created. You should now see a file structure (if you only see a single folder, open it).
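If you would rather not rename anything, the same exploration can be done from a script. The following is only a minimal sketch using Python's standard `zipfile` module; the function names `list_parts` and `extract_all` are made up for this example.

```python
import zipfile
from pathlib import Path


def list_parts(docx_path):
    """Return the names of all files stored inside an Office document."""
    # An Office file can be opened as a Zip archive without renaming it.
    with zipfile.ZipFile(docx_path) as archive:
        return archive.namelist()


def extract_all(docx_path, target_dir):
    """Unpack every file of the document into target_dir."""
    with zipfile.ZipFile(docx_path) as archive:
        archive.extractall(target_dir)
    return Path(target_dir)
```

For example, `list_parts("example.docx")` (where `example.docx` is a placeholder for your own file) returns the list of file names you would otherwise see after unzipping.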

 

Let’s explore

The opened folder should have a structure like this 

 


 

Interesting files

document.xml

This file contains most of the text content of your document, including the structure (headings, bold text, lists) of the document itself.
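To give an idea of how the text is stored, here is a small Python sketch that pulls the plain text of each paragraph out of document.xml. It assumes the file sits at `word/document.xml` inside the archive and uses the usual WordprocessingML namespace, which is what Word normally produces; `read_paragraphs` is a name invented for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the main WordprocessingML elements (assumed default format).
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_paragraphs(docx_path):
    """Return the plain text of each paragraph in word/document.xml."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(W + "p"):
        # A paragraph is split into runs; the visible text sits in <w:t> elements.
        text = "".join(t.text or "" for t in paragraph.iter(W + "t"))
        paragraphs.append(text)
    return paragraphs
```

This ignores all formatting and only keeps the text, which is usually enough when the goal is to recover the content of a document that no longer opens.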

 “media” folder

It contains all the images of the document (Word usually renames them to image1, image2 and so on, keeping the original file extension).
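Recovering the images can also be scripted. The sketch below copies everything found under `word/media/` (the usual location in a Word file, assumed here) into a folder of your choice; `save_media` is a hypothetical helper name.

```python
import zipfile
from pathlib import Path


def save_media(docx_path, target_dir):
    """Copy every file under word/media/ into target_dir; return their names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Skip directory entries, keep only real files in the media folder.
            if name.startswith("word/media/") and not name.endswith("/"):
                file_name = Path(name).name
                (target / file_name).write_bytes(archive.read(name))
                saved.append(file_name)
    return saved
```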

 “fonts” folder

It contains the fonts used in the document that are not included in the system.

Other files

[Content_Types].xml

This file lists the content type of every part of the package. It is what identifies the file as a Word document and differentiates it from other Office files (Excel and PowerPoint).
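As an illustration, the following Python sketch reads [Content_Types].xml and guesses whether a file is a Word, Excel or PowerPoint document. It relies on the content type of the main part containing a keyword such as `wordprocessingml` and ending in `.main+xml`; that holds for ordinary files, but macro-enabled files use different content types and will come back as "unknown". The names `detect_kind` and `KINDS` are invented for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the elements inside [Content_Types].xml.
CT = "{http://schemas.openxmlformats.org/package/2006/content-types}"

# Keyword found in the content type of the main part of each kind of file.
KINDS = {
    "wordprocessingml": "Word",
    "spreadsheetml": "Excel",
    "presentationml": "PowerPoint",
}


def detect_kind(office_path):
    """Guess which Office application a file belongs to."""
    with zipfile.ZipFile(office_path) as archive:
        root = ET.fromstring(archive.read("[Content_Types].xml"))
    for override in root.iter(CT + "Override"):
        content_type = override.get("ContentType", "")
        # Only the main part has a content type ending in ".main+xml".
        if content_type.endswith(".main+xml"):
            for keyword, kind in KINDS.items():
                if keyword in content_type:
                    return kind
    return "unknown"
```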

 fontTable.xml

This file includes data related to the fonts of the document.
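If all you want is the names of the fonts, a short Python sketch like this one can list them. It assumes the file is stored as `word/fontTable.xml` and that each font is declared in a `<w:font>` element with a `w:name` attribute, as Word normally writes it; `list_fonts` is a hypothetical name.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the main WordprocessingML elements (assumed default format).
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def list_fonts(docx_path):
    """Return the font names declared in word/fontTable.xml."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/fontTable.xml"))
    # Each <w:font> element carries the font name in its w:name attribute.
    return [font.get(W + "name") for font in root.iter(W + "font")]
```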

 numbering.xml

This file defines the numbered and bulleted lists of the document; it is not very relevant here.

 settings.xml

An important file: it contains the settings of the document.

 styles.xml

This file defines the styles of the document (headings, paragraph and character styles), including their font settings.

 “theme” folder

This is very important, too, because it contains additional styling and can be compared to a PowerPoint theme, which changes the look of the document itself.

 

“_rels” folder

It contains the relationships (“rels” stands for “relationships”) between the parts of the document: for example, the “document.xml.rels” file contains all the links of the document.
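Since “document.xml.rels” is hard to read by eye, a few lines of Python can pick the links out of it. This is a sketch under the assumption that the file is stored as `word/_rels/document.xml.rels` and that hyperlinks use the standard relationship type shown below; `read_links` is a name made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the elements inside a .rels file.
REL = "{http://schemas.openxmlformats.org/package/2006/relationships}"

# Relationship type that marks a hyperlink (assumed default format).
HYPERLINK = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)


def read_links(docx_path):
    """Return the target of every hyperlink listed in document.xml.rels."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/_rels/document.xml.rels"))
    return [
        rel.get("Target")
        for rel in root.iter(REL + "Relationship")
        if rel.get("Type") == HYPERLINK
    ]
```

Other entries in the same file (images, styles and so on) are skipped, so only the links remain.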

 

Conclusion

Though very “specific”, I felt I had to write this: even if nowadays most computers have a way to read Office files (you can also use Google Docs, which is a website), you may find, once in your life, that you are not able to open a document. In that case, either you lose all your work, or you can try to work around the problem.

For example, the “Interesting files” section is very useful if you want to recover some content or an image, or just find out which wonderful font was used in a document; the “document.xml.rels” file (though very difficult to read) may also be useful if you want to recover a link you remember was there.

Giorgio B., 5scB
