STATA is a general-purpose statistical package that helps the user summarize and analyze datasets. A dataset is a collection of pieces of information, called variables. Variables are normally arranged in columns, while each row holds one observation, that is, one set of values for those variables. Other commonly used statistical packages include SPSS, SAS and R. STATA is widely used in social science research. An important feature of a dataset is its format, which is either ASCII (American Standard Code for Information Interchange) text or STATA's own format.

Data can either be stored in a separate file, referred to here as DATA, or typed directly into STATA in interactive mode. Text format data files conventionally have the suffix ".RAW", while STATA format data files have the suffix ".DTA". Text format datasets may also carry another suffix, such as ".TXT". Starting STATA from the Windows menu opens the program in interactive mode, which means that commands are executed one by one as they are typed. The prompt "." indicates that the user is inside STATA and that it is ready for a command. Commands are shown in capitals here for emphasis, but STATA expects them to be typed in lowercase. The first step after starting STATA is to load the data into STATA's memory. If the data are in text format, the command is INFILE VAR1 VAR2 USING c:\path\DATA, where VAR1 and VAR2 (and possibly VAR3, and so on) are the names given to the variables that make up DATA. The user must specify the drive and path of the directory in which the DATA file is stored. On older DOS-based systems the file name could be at most eight characters long; current versions of Windows do not impose this limit.
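The two ways of getting data into memory described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the values typed under INPUT are made up, the names var1 and var2 are placeholders, and c:\path\DATA is the placeholder path used in the text, so the second part only runs if such a file actually exists.

```stata
* Option 1: type a small dataset directly in interactive mode.
* The numbers below are invented purely for illustration.
clear
input var1 var2
1 10
2 20
3 30
end

* Option 2: read two numeric variables from a text format file.
* c:\path\DATA is a placeholder; when no suffix is given,
* infile looks for a file ending in .raw.
clear
infile var1 var2 using "c:\path\DATA"
```

The CLEAR command empties STATA's memory first, because INFILE and INPUT will not overwrite a dataset that is already loaded.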

For data in STATA format, the command is USE c:\path\DATA. Unlike text files, STATA format data files already contain the variable names, so these do not need to be specified again. A STATA format data file is created from a text format file by first loading the text data with the INFILE command and then saving it with the command SAVE c:\path\DATA, which creates a file called "DATA.DTA" in the specified directory. Before analyzing the data, the first step is to make sure the right file is loaded and to get a rough idea of its contents. The command DESCRIBE gives basic information about the data that STATA holds in memory, such as the number of variables, the number of observations and the names of the variables. The command LIST displays the values of all the variables currently in STATA's memory. For example, LIST VAR1 VAR2 displays only the variables named VAR1 and VAR2, and LIST IN 1/10 displays the first 10 observations of all the variables in the dataset.
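A short session that converts a text file to STATA format and then inspects it might look like the sketch below. It assumes that a text file exists at the placeholder location c:\path\DATA, that it holds two numeric variables (named var1 and var2 here for illustration), and that it has at least 10 observations.

```stata
* Load the text format data (placeholder path and variable names).
clear
infile var1 var2 using "c:\path\DATA"

* Save in STATA format; this creates DATA.dta in the same directory.
* The replace option overwrites an existing file of that name.
save "c:\path\DATA", replace

* Reload the STATA format file; variable names are stored in it.
use "c:\path\DATA", clear

* Check what is in memory.
describe

* Show the values of selected variables, then the first 10 observations.
* list in 1/10 stops with an error if there are fewer than 10 observations.
list var1 var2
list in 1/10
```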

The command SUMMARIZE provides summary statistics, such as the mean and standard deviation, for the variables in STATA's memory. SORT VAR1 rearranges the observations so that the values of VAR1 appear in ascending order. Least squares regression is one of the essential statistical methods, and regression problems can also be solved in STATA. In interactive mode, the user enters commands one at a time and gets the results of each command before entering the next.
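As an illustration of these commands, the sketch below uses the example dataset auto that ships with STATA, so it can be tried without any data file of one's own. The choice of dataset and of the variables price, mpg and weight is an assumption made for this example only; REGRESS is STATA's command for least squares regression, with the dependent variable listed first.

```stata
* Load the example dataset that is installed with STATA.
sysuse auto, clear

* Mean, standard deviation, minimum and maximum of two variables.
summarize price mpg

* Rearrange the observations so that price is in ascending order.
sort price

* Show the five cheapest cars after sorting.
list make price in 1/5

* Least squares regression of price on mpg and weight.
regress price mpg weight
```

Typed at the "." prompt one line at a time, each command prints its results before the next one is entered, which is the interactive way of working described above.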