
This article is part of the Stata for Students series. If you are new to Stata we strongly recommend reading all the articles in the Stata Basics section.

Stata tries very hard to make all its commands work the same way. Spending a little time learning the syntax itself will make it much easier to use commands later.

To carry out the examples in this section, you'll need to have created an SFS folder and downloaded the gss_sample data set as described in Managing Stata Files. Create a new do file in that folder called syntax.do, as described in Doing Your Work Using Do Files. The example commands will go after use gss_sample and before log close. Add the example commands to this do file as you go, and run it frequently to see the results.

Commands tell Stata to do something: summarize, tabulate, regress, etc. Normally the command itself comes first, and then you tell Stata the details of what you want it to do. Many commands can be abbreviated: sum instead of summarize, tab instead of tabulate, reg instead of regress. Commands that can destroy data, like replace, cannot be abbreviated.

Variable Lists

A list of variables after a command tells the command which variables to act on. First try sum (summarize) all by itself, and then followed by age (sum age). If you don't specify which variables sum should act on, it will give you summary statistics for all the variables in the data set. Putting age after sum tells it to only give you summary statistics for the age variable. If you list more than one variable, the command will act on all of them. Listing the variables for age, years on the job, and the rating of the respondent's job's prestige gives you summary statistics for all three.

If Conditions

An if condition tells a command which observations it should act on. It will only act on those observations where the condition is true. This allows you to do things with subsets of the data. An if condition comes after a variable list. For example, following sum yearsjob with a condition that the respondent's sex is equal to 1 gives you summary statistics for years on the job for just the male respondents (in the GSS 1 is male and 2 is female). Note that such a condition is written with two equals signs (==)! In Stata you use one equals sign when you're setting something equal to something else (see Creating Variables) and two equals signs when you're asking if two things are equal.

Conditions can be combined with & (and) or | (or). Requiring that a respondent is male & has a household income of $10,000 or more gives you summary statistics for years on the job for respondents who are male and have a household income of $10,000 or more. Joining the same two conditions with | gives you summary statistics for years on the job for respondents who are male or have a household income of $10,000 or more, a very different group.

Any conditions you combine must be complete. If you want summary statistics for years on the job for respondents who are either black (race==2) or "other" (race==3) you cannot use:

sum yearsjob if race==2 | 3 // don't do this

(What this does and why is left as an exercise for the reader, but it's not what you want.) Instead you should use:

sum yearsjob if race==2 | race==3 // do this instead

Missing Values

If you have missing values in your data, you need to keep them in mind when writing if conditions. The generic missing value (.) acts like positive infinity, and the extended missing values (.a, .b, and so on) are treated as larger still.
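Because missing values act like positive infinity, a "greater than" condition will also be true for observations where the variable is missing. The following is a minimal sketch of that behavior, assuming gss_sample is in the current folder; the cutoff of 10 years is arbitrary and chosen only for illustration.

```stata
* Assumes gss_sample.dta is in the current folder and contains yearsjob.
use gss_sample, clear

* Missing values compare as larger than any number, so this count
* also includes respondents whose yearsjob is missing.
count if yearsjob > 10

* Adding yearsjob < . restricts the count to observed values,
* since every nonmissing number is smaller than the generic missing value.
count if yearsjob > 10 & yearsjob < .

* The missing() function is another way to write the same restriction.
count if yearsjob > 10 & !missing(yearsjob)
```

If the first count is larger than the other two, the difference is the number of respondents with a missing value for yearsjob.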

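The commands discussed in this section could be collected in syntax.do roughly as follows. This is only a sketch: the log file name (syntax.log) and the variable name sex are assumptions, so adjust them to match your own do file and data set.

```stata
* syntax.do: a sketch of the examples in this section.
* Assumptions: gss_sample.dta is in the current folder, the log is named
* syntax.log, and the respondent's sex is stored in a variable called sex
* (1 = male, 2 = female).
capture log close
log using syntax.log, replace
clear all
use gss_sample

* No variable list: summary statistics for every variable
sum

* One variable, then more than one
sum age
sum age yearsjob

* An if condition follows the variable list; == asks whether two things are equal
sum yearsjob if sex==1

* Each condition combined with | must be complete
sum yearsjob if race==2 | race==3

log close
```

Running the do file after adding each command makes it easy to compare the output with the descriptions in the text.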