
Occasionally you may want to obtain a random sample of your data. For example, suppose you have 1000 cases of a variable and would like a random sample of approximately 200 cases, or 1/5 of the data.

Statit has a function that helps here: ranunfrm(a,b) returns a random number between "a" and "b" drawn from the uniform distribution. Either or both of "a" and "b" can be Statit variables. Using the 1000 cases example, we could:
## Create tmp with 1000 cases with value 100
assi tmp 1000*100
## Create a random value between 0 and 100 for each of the 1000 cases
let ran = ranunfrm(0,tmp)
## Select a range of those values that would represent about 1/5
select push (ran > 20 and ran <=40)
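
For readers who want to check the logic outside Statit, the three steps above can be sketched in Python. This is only an illustration under stated assumptions: `window_sample` and its parameters are hypothetical names, Python's `random.uniform` stands in for ranunfrm, and the list `cases` is made-up stand-in data.

```python
import random

def window_sample(values, low=20.0, high=40.0, top=100.0, seed=None):
    """Keep each case whose uniform draw between 0 and top lands in (low, high]."""
    rng = random.Random(seed)
    kept = []
    for value in values:
        ran = rng.uniform(0.0, top)   # plays the role of ranunfrm(0,tmp)
        if low < ran <= high:         # same window as (ran > 20 and ran <=40)
            kept.append(value)
    return kept

cases = list(range(1000))             # stand-in for 1000 cases of a variable
sample = window_sample(cases, seed=1)
print(len(sample))                    # close to 200, but usually not exactly 200
```

Each case is kept independently with probability (40 - 20) / 100 = 0.2, so the sample size varies from run to run around 200 rather than being fixed.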

Now suppose the data are organized into samples. Perhaps we have a variable called sample_id, where each sample_id has 3 measurements:

| Sample_ID | Measurement |
|-----------|-------------|
| 1 | 45 |
| 1 | 56 |
| 1 | 43 |
| 2 | 82 |
| 2 | 93 |
| 2 | 45 |
| 3 | 56 |
| 3 | 44 |
| 3 | 93 |

The following macro script would select about twenty percent of the samples at random.
## Group by sample_id and save
group measurement by sample_id /save
## How many samples do we have?
let $case = case(group.mean)
## Create a random value for each of these samples
assi ran = $case * 100
let ran = ranunfrm(0,ran)
## Identify the samples we don't want by assigning the missing value to them
if ran > 20 then ran = #_sysmiss
## Do a match merge to integrate the two sets of data. See match-merge in the
## Edit -> Data Management Menu
match measurement with ran by sample_id Group.sample_id
## Now use global select, local select or select permanent as you wish
stats measurement /select=(ran != #_sysmiss)
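
The same idea, drawing one random value per sample rather than per case, can be sketched in Python for comparison. This is not Statit syntax: `sample_groups`, `rows`, and `percent` are hypothetical names chosen for the illustration, and a dictionary lookup stands in for the match-merge step.

```python
import random

# The sample_id / measurement pairs from the table above.
rows = [
    (1, 45), (1, 56), (1, 43),
    (2, 82), (2, 93), (2, 45),
    (3, 56), (3, 44), (3, 93),
]

def sample_groups(rows, percent=20.0, seed=None):
    """Keep whole samples: every measurement of a sample is kept or dropped together."""
    rng = random.Random(seed)
    # One random value between 0 and 100 per distinct sample_id.
    sample_ids = sorted({sample_id for sample_id, _ in rows})
    ran = {sample_id: rng.uniform(0.0, 100.0) for sample_id in sample_ids}
    # Samples whose draw exceeds the cutoff are the ones the macro marks as missing.
    kept_ids = {sample_id for sample_id, value in ran.items() if value <= percent}
    # Stand-in for the match-merge: each row inherits its sample's status.
    return [(sample_id, m) for sample_id, m in rows if sample_id in kept_ids]

selected = sample_groups(rows, percent=20.0, seed=7)
print(selected)
```

With only three samples, each kept with probability 0.2, the result is often empty. The approach is more informative on data with many samples.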

Now create some data in the workspace and work through the example above to see how it works step by step.

This is only one of many ways to get a random sample, and there are several other random number generators that you could use.
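
As one illustration of a different approach, when the sample must contain exactly 200 cases rather than approximately 200, the cases can be drawn without replacement. The sketch below uses Python's standard `random.sample` and is not a Statit feature. `exact_sample` and `cases` are hypothetical names.

```python
import random

def exact_sample(values, k, seed=None):
    """Draw exactly k distinct cases without replacement."""
    rng = random.Random(seed)
    return rng.sample(values, k)

cases = list(range(1000))      # stand-in for 1000 cases of a variable
sample = exact_sample(cases, 200, seed=1)
print(len(sample))             # 200
```

The trade-off is that the sample size is fixed in advance, whereas the uniform-window method gives a size that varies around the target.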

If you would like additional information, please call our Support staff at (541) 752-4100 or send email to .