When analyzing source code archives of unknown origin, it can be helpful to find out where the code originated geographically. Comments in these files can help, as they are often written in the developer's native language. Identifying the natural language used in a file helps with understanding the flow of the code (for example, by translating the comments) and with establishing provenance. Analyzing the contents of a file to see which character sets (scripts) its characters belong to allows a better guess to be made.
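
As a rough illustration of the idea, the following Python sketch counts which scripts the letters in a file belong to. It is only a minimal sketch under stated assumptions: the function names `script_histogram` and `guess_scripts` are hypothetical, the input is assumed to be UTF-8, and the first word of each character's Unicode name (such as `CYRILLIC` or `CJK`) is used as a stand-in for its script, since Python's standard library does not expose the Unicode Script property directly.

```python
import unicodedata
from collections import Counter


def script_histogram(text: str) -> Counter:
    """Count letters per rough script label.

    Assumption: the first word of a character's Unicode name (e.g. "CYRILLIC",
    "LATIN", "CJK") serves as an approximation of its script.
    """
    counts = Counter()
    for ch in text:
        # Only letters carry useful signal; skip digits, punctuation, whitespace.
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    return counts


def guess_scripts(data: bytes, encoding: str = "utf-8") -> list:
    """Decode raw file contents and return (label, count) pairs, most frequent first."""
    # Assumption: the file is UTF-8; undecodable bytes are replaced rather than fatal.
    text = data.decode(encoding, errors="replace")
    return script_histogram(text).most_common()


if __name__ == "__main__":
    sample = "int x; // привет мир".encode("utf-8")
    print(guess_scripts(sample))  # [('CYRILLIC', 9), ('LATIN', 4)]
```

A script count like this only narrows the candidates: Cyrillic text could be Russian, Ukrainian or Bulgarian, and Latin letters appear in almost every source file because of the code itself, so the result is a hint rather than an identification.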

