*******************************
*** First Steps ***************
*******************************

*** Always set your working directory first
cd D:\Stata // change the working directory (choose your own directory and enter its full path here)

*** Use a log file to keep track of everything you do
log using log1, replace
// You can keep only one unnamed log file open at a time,
// so opening another one gives an error message.
// (capture noisily shows the error but lets the do-file continue.)
capture noisily log using log2, replace
// The command works once the existing log is closed,
// so close all open logs first:
log close _all
// Now open a log file again, specifying the "text" option to save it in txt format
log using week1.txt, text replace
// Also create a command log:
cmdlog using cmdweek1.txt, replace // We will learn what it is later.

*******************************
*** View Data
*******************************

*** Open a dataset
sysuse auto, clear // clear removes any dataset already in memory
// Check what the Variables and Properties panels look like

*** Think of a few ways you would like to explore the data.
*** The most direct way is simply to take a look at the data:
browse
// Structure of the data
// Explain colors, missing values, categorical data
// Note that browse only lets you read the data, not edit it.
// To edit the data, type "edit" instead:
edit // I strongly discourage you from editing data this way.
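
*** A reproducible alternative to the Data Editor
*** (a minimal sketch, not part of the original walkthrough)
// Inspect a few rows in the Results window instead of opening the Data Browser:
list make price mpg rep78 in 1/5
// Make changes with commands rather than by hand, so they are recorded in the log.
// The variable name price_k is hypothetical, chosen only for illustration;
// it is dropped right away so the rest of the script sees the original data.
generate price_k = price/1000
label variable price_k "Price (thousands of dollars)"
summarize price price_k
drop price_k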

*** To learn more about the data (structure)
describe
// clear more
// elements displayed

*** Summary statistics
summarize
// Why does "make" have no observations?
// rep78 has fewer observations because of missing values

*** To learn the contents of the data in detail
codebook make // unique values
codebook foreign // value labels
codebook rep78 // missing values

*** To learn more about the missing data
browse if missing(rep78)
list make if missing(rep78)

*** More summary statistics, e.g., the median
summarize, detail
// notice that the mean is very small
browse if price>13000

*** Learn the composition of foreign/domestic cars and repair records
tabulate foreign
tabulate rep78
tabulate foreign rep78, row

*** To compare gas mileage between foreign and domestic cars:
*** Think about the potential ways to do this
*** Method 1
summarize mpg if foreign==0
summarize mpg if foreign==1 // equal sign: use == to test equality
*** Method 2
by foreign, sort: summarize mpg
*** Method 3
tabulate foreign, summarize(mpg)

*** Hypothesis testing
ttest mpg, by(foreign)

*** Relationships between mpg and weight
correlate mpg weight
by foreign, sort: correlate mpg weight
correlate mpg weight length turn displacement

*** Graph data
twoway (scatter mpg weight)
twoway (scatter mpg weight), by(foreign, total)

*** Model fitting
generate wtsq=weight^2
regress mpg weight wtsq foreign
predict mpghat
twoway (scatter mpg weight) (line mpghat weight, sort), by(foreign)
generate gp100m=100/mpg
label variable gp100m "Gallons per 100 miles"
twoway (scatter gp100m weight), by(foreign, total)
regress gp100m weight foreign

*** Close and save the log files before you go
log close _all
cmdlog close
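
*** Optional: review what the two logs recorded
*** (a minimal sketch, not part of the original walkthrough; it assumes both
*** files were written to the working directory set at the top of this script)
type cmdweek1.txt // the command log records the commands typed, without their output
type week1.txt // the full log records commands together with their output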