Now that you have installed RStudio, learned about assignments and written some basic code, there’s nothing stopping you from becoming a journocoder!

To get a deeper understanding of how R stores your data, we’re now going to take a closer look at data structures in R, starting with a central concept: Vectors.

Working with vectors

You will work with vectors a lot in R, and I mean a lot. R loves vectors. It treats a scalar (a single value) as nothing but a vector with only one element. There are all kinds of data structures in R, but most of them are basically just different compositions of vectors. We will get to know them better as we go along. For example, a matrix is a vector cut into multiple pieces of the same length. A list can combine vectors of different lengths and types, and R even manages to see data frames as something made of vectors. So if you know how to handle vectors in R, that’s a good step towards coding proficiency.
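A quick way to convince yourself that a single value really is a vector of length one is to ask R directly. The sketch below uses only base R functions; the variable names x and y are just made up for illustration.

```r
# A single value is stored as a vector with exactly one element
x <- 5
is.vector(x)   # TRUE
length(x)      # 1
x[1]           # 5, the first (and only) element

# Combining it with other values in c() gives the same kind of object, just longer
y <- c(x, 7, 9)
length(y)      # 3
```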

Vectors are created with the c() function. Like single values, you can name your vectors however you want and perform all kinds of calculations on them.

Elements of a vector are separated by commas in the c() function, but you can also generate sequences of numbers in different ways. For example, if you write “1:10” instead of a value, R will add the numbers 1 through 10 to your vector. Also, instead of writing “c(3,3,2,2)”, you can tell R to repeat the numbers 3 and 2 two times each with the rep() function, like I did below with the variable h2. You can also tell R to repeat a whole sequence, like with p2. Run the code below and have a closer look at the variables and the output R returns.

myvector <- c(2,5,8,3,40)
g <- c(10,22,5,71,-5) 
myvector+g

k1 <- c(1,2,3,4,5,6,7,8,9,10,60)
k2 <- c(1:10, 60) 
k1
k2

h1 <- c(3,3,2,2)
h2 <- rep(c(3, 2), each=2)
h1
h2

p1 <- c(0,1,2,0,1,2,0,1,2)
p2 <- rep(0:2, times=3)
p1
p2
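Another base R helper for building sequences is seq(). It is not used in the code above, so treat this as an optional extra; the variable names s1 to s3 are invented for this sketch.

```r
# seq() builds a sequence from a start value to an end value
s1 <- seq(1, 10)                  # same numbers as 1:10
s2 <- seq(0, 20, by = 5)          # steps of 5: 0 5 10 15 20
s3 <- seq(0, 1, length.out = 5)   # 5 evenly spaced values: 0.00 0.25 0.50 0.75 1.00

s1
s2
s3
```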

Try to create some vectors in different ways by yourself!

Now, define two vectors of the same length (with the same number of elements) and try to do some of the basic math you learned in the previous chapter. For example, try:

a <- c(2, 3, 5, 30)
b <- c(1, -5, 100, 4)

a+b
b/2
a/b
a^3
b*b*-a

Try some more things if you want. Now go for the basic math functions:

sqrt(a) # try this with b to see how R handles impossible operations!
sum(a)
sum(a,b) 
log(a) # try this with b, too
abs(b)

In the last chapter I said sum(5, 4) does the same as 5+4. Is this still true when it comes to vectors? Compare the results!

Functions like sqrt() and log() are not defined for negative values. They will work for every positive element of your vector, but for the negative elements R prints a warning (not an error) and returns NaN instead of a result. NaN stands for “not a number”. It is possible to work with a vector containing NaNs, but you should double-check whether you actually want them in there.

log(b)+a
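To check where the NaNs ended up, base R offers is.nan(). The following sketch reuses the values of a and b from above; suppressWarnings() is only there to keep the output tidy, and the name result is invented for the example.

```r
a <- c(2, 3, 5, 30)
b <- c(1, -5, 100, 4)

# log(-5) is not defined, so the second element becomes NaN
result <- suppressWarnings(log(b) + a)
result

is.nan(result)          # FALSE TRUE FALSE FALSE
which(is.nan(result))   # 2, the position of the NaN

# Keep only the elements that are actual numbers
result[!is.nan(result)]
```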

 

Watch out!

So much for vectors of the same length. What about vectors that have a different number of elements? Try this:

n <- c(1, 2)
m <- c(4, 5, 6, 7)

n+m
n*m
m/n

Works well, hm? But why? The answer is something you should keep in mind: If one vector is shorter than the other in an operation that needs vectors of the same length, R repeats the elements of the shorter vector until the two are the same length! So for “n+m”, R doesn’t calculate “(1, 2)+(4, 5, 6, 7)” but “(1, 2, 1, 2)+(4, 5, 6, 7)”.
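This repetition can be spelled out by hand with rep(). The sketch below is only an illustration of what R does implicitly; n_recycled is a made-up name. It also shows a case worth knowing: when the longer length is not a multiple of the shorter one, R still repeats the elements but prints a warning.

```r
n <- c(1, 2)
m <- c(4, 5, 6, 7)

# What R does behind the scenes: stretch n to the length of m
n_recycled <- rep(n, length.out = length(m))   # 1 2 1 2
n_recycled + m                                 # 5 7 7 9
n + m                                          # same result: 5 7 7 9

# Three elements do not fit evenly into four:
# R still repeats them but prints a warning
c(1, 2, 3) + m                                 # 5 7 9 8, plus a warning
```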

 

Interesting functions for your first data analysis

Let’s look at a few useful functions that can help you analyze vectors. Remember to use the help functions or the internet if you don’t understand a function.

length(m) # length/number of elements
sort(a) # orders the vector elements (increasing)
min(n) # smallest value of the vector n
max(n) # biggest value of the vector n

sum(b[b<0]) # sum of only the negative (smaller than zero) elements of b
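The last line packs three steps into one. Taken apart, and assuming b still holds the values from above, it could look like this (is_negative is a name invented for the sketch):

```r
b <- c(1, -5, 100, 4)

# Step 1: the comparison returns one TRUE/FALSE per element
is_negative <- b < 0
is_negative           # FALSE TRUE FALSE FALSE

# Step 2: square brackets keep only the elements marked TRUE
b[is_negative]        # -5

# Step 3: sum() adds up what is left
sum(b[is_negative])   # -5
```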

But wait, there’s more: You can round vector elements or turn a vector into a matrix. Look closely at the output of this piece of code: What is the difference between C and C2? What is the difference between C2 and C3?

v <- c(2, 39, 4, 32, 54, 2)
round(v, -1) # rounds elements to tens
round(v, -2) # rounds to hundreds

# turn your vector into a matrix:
C <- matrix(v, 3, byrow=TRUE)
C
C2 <- matrix(v, 2, byrow=TRUE)
C2

C3 <- matrix(v, 2)
C3

Functions, as you may have already noticed, can work with different parameters that determine their output. These are called arguments. They are specified in parentheses after the function name. Here, the second argument of the matrix function tells R how many rows the matrix will have. The logical argument byrow controls how the matrix is filled with the vector’s elements: row by row if it is TRUE, column by column (the default) if it is left out. Because this is a crash course, we won’t go much further into vectors and matrices. But if you want to learn more about them, go for it!
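Arguments can be passed by position or by name. As a small sketch (the names v, m1, m2 and m3 are made up here; data, nrow and byrow are the argument names listed on the help page of matrix()):

```r
v <- c(2, 39, 4, 32, 54, 2)

# Arguments matched by position: data first, number of rows second
m1 <- matrix(v, 2)

# The same call with every argument named; the order no longer matters
m2 <- matrix(nrow = 2, data = v, byrow = FALSE)

identical(m1, m2)   # TRUE: byrow = FALSE is the default

# byrow = TRUE fills the matrix row by row instead
m3 <- matrix(v, nrow = 2, byrow = TRUE)
m3
```

Naming arguments takes a little more typing, but it makes the code easier to read when you come back to it later.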

 

Oo-de-lally!

At this point you know enough about programming in R to have a closer look at what’s useful for journocoding! In the meantime, it’s always a good idea to play around with what you’ve already learned!

In the next chapters, we will get to know other data structures, like lists or data frames. We will learn how to load data into the workspace, like Excel sheets or CSV files. And we will have a look at the most important statistical values that are interesting for journocoders like you and how R can help you analyze and visualize your data. You will learn how to use and write functions and how to use packages in R. Sounds awesome, right? Let’s do it!

 

{Credits for the awesome featured image go to Phil Ninh}