# How to Sort a Data Frame by Multiple Columns in R


To begin understanding how to properly sort data frames in `R`, we of course must first generate a data frame to manipulate.

``````
# run.R
# Generate data frame
dataframe <- data.frame(
  x = c("apple", "orange", "banana", "strawberry"),
  y = c("a", "d", "b", "c"),
  z = c(4:1)
)

# Print data frame
dataframe
``````

Note: The indentation and spacing aren’t necessary, but they improve legibility.

Executing our `run.R` script prints the data frame as expected, with the rows in the order they were entered:

``````
$ Rscript run.R
           x y z
1      apple a 4
2     orange d 3
3     banana b 2
4 strawberry c 1
``````

## The Order Function

While perhaps not the easiest sorting method to type out in terms of syntax, the `order` function is the one most readily available, because it is part of the `base` package that ships with every installation of `R`.

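The `order` function accepts a number of arguments, but at the simplest level its first arguments are one or more vectors of the same length (numeric, complex, character, or logical). It returns the positions that would arrange those values in sorted order, which is why its result is used as an index.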

For example, we can use `order()` to simply sort a vector of five randomly ordered numbers with this script:

``````
# Create unordered vector
vector <- c(2, 5, 1, 3, 4)

# Print vector
vector

# Sort in ascending order
vector[order(vector)]
``````

Executing the script, we see the initial output of the unordered vector, followed by the same vector in ascending order:

``````
$ Rscript run.R
[1] 2 5 1 3 4
[1] 1 2 3 4 5
``````
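It can help to look at what `order()` returns on its own: not the sorted values, but the positions that would put the vector in sorted order. The sketch below reuses the numbers from the script above under the illustrative names `values` and `idx`, and also shows the `decreasing` argument of `order()`.

```r
# Same numbers as the unordered vector above
values <- c(2, 5, 1, 3, 4)

# order() returns positions, not values
idx <- order(values)
idx
# [1] 3 1 4 5 2

# Indexing with those positions yields the sorted values
values[idx]
# [1] 1 2 3 4 5

# decreasing = TRUE reverses the direction
values[order(values, decreasing = TRUE)]
# [1] 5 4 3 2 1
```

Reading `idx` left to right: the smallest value sits at position 3, the next smallest at position 1, and so on.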

## Sorting a Data Frame by Vector Name

With the `order()` function in our tool belt, we’ll start sorting our data frame by passing in the vector names within the data frame.

For example, using our previously generated `dataframe` object, we can sort by the vector `z` by adding the following code to our script:

``````
# Sort by vector name [z]
dataframe[
  with(dataframe, order(z)),
]
``````

What we’re effectively doing is indexing our original `dataframe` object with the row order that we’d like to have. This row order is generated using the `with()` function, which evaluates an expression (the second argument) in an environment constructed from the data passed as the first argument.

Thus, we’re evaluating `order()` against the `dataframe` data, ordering by the `z` vector within that data frame. This returns a new row order, which is then used inside the [brackets] of `dataframe[]`. Because the position after the comma is left empty, all columns are kept, and the rows are output in the new order.

``````
$ Rscript run.R
           x y z
1      apple a 4
2     orange d 3
3     banana b 2
4 strawberry c 1
           x y z
4 strawberry c 1
3     banana b 2
2     orange d 3
1      apple a 4
``````

Consequently, we see our original unordered output, followed by a second output with the data sorted by column `z`.
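The same row order can also be produced without `with()`, by referring to the column through the `$` operator. This is a minimal sketch that rebuilds the example data frame so it can run on its own; `by_with` and `by_dollar` are names chosen for illustration.

```r
# Rebuild the example data frame from the start of the tutorial
dataframe <- data.frame(
  x = c("apple", "orange", "banana", "strawberry"),
  y = c("a", "d", "b", "c"),
  z = c(4:1)
)

# with() evaluates order(z) using the data frame's columns
by_with <- dataframe[with(dataframe, order(z)), ]

# $ extracts the z column directly
by_dollar <- dataframe[order(dataframe$z), ]

# Both approaches give the same result
identical(by_with, by_dollar)
# [1] TRUE
```

Which form to use is largely a matter of taste: `with()` saves repeating the data frame name when several columns are involved, while `$` keeps everything explicit.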

## Sorting by Column Index

Similar to the above method, it’s also possible to sort based on the numeric `index` of a column in the data frame, rather than the specific name.

Instead of using the `with()` function, we can pass the column to `order()` directly. We indicate that we want to sort by the column of index `1` by using the `dataframe[,1]` syntax, which causes `R` to return the values of that first column as a vector. In other words, similar to when we passed in the `z` vector name above, `order` is sorting based on the values within the column of index `1`:

``````
dataframe[
  order( dataframe[,1] ),
]
``````

As expected, we get our normal output followed by the output sorted alphabetically by the first column:

``````
$ Rscript run.R
           x y z
1      apple a 4
2     orange d 3
3     banana b 2
4 strawberry c 1
           x y z
1      apple a 4
3     banana b 2
2     orange d 3
4 strawberry c 1
``````
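To see what `order()` actually receives here, the sketch below extracts `dataframe[, 1]` by itself and then computes the positions from it. It rebuilds the example data so it runs standalone, and it sets `stringsAsFactors = FALSE` explicitly (an addition not in the original script) so the column is a character vector regardless of R version; `first_col` is an illustrative name.

```r
# Rebuild the example data frame; stringsAsFactors = FALSE is spelled out
# here as an assumption so the result does not depend on the R version
dataframe <- data.frame(
  x = c("apple", "orange", "banana", "strawberry"),
  y = c("a", "d", "b", "c"),
  z = c(4:1),
  stringsAsFactors = FALSE
)

# Extracting a single column with [, 1] returns a plain vector
first_col <- dataframe[, 1]
is.character(first_col)
# [1] TRUE

# order() returns the row positions in alphabetical order
order(first_col)
# [1] 1 3 2 4
```

Those positions (rows 1, 3, 2, 4) match the row names shown in the sorted output above.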

## Sorting by Multiple Columns

In some cases, you may want to sort by multiple columns. Thankfully, doing so is very simple with the previously described methods.

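To sort by multiple columns using vector names, simply add additional arguments to the `order()` function call. Rows are ordered by the first argument, and each later argument is only used to break ties left by the ones before it: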

``````
# Sort by vector name [z] then [x]
dataframe[
  with(dataframe, order(z, x)),
]
``````

Similarly, to sort by multiple columns based on column index, add additional arguments to `order()` with differing indices:

``````
# Sort by column index [1] then [3]
dataframe[
  order( dataframe[,1], dataframe[,3] ),
]
``````
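Every value in the first sort column of the example data is unique, so the second column never gets a chance to break a tie. The sketch below uses a small hypothetical data frame, `fruits`, with repeated values in its `kind` column to make the tie-breaking visible. It also shows one common way to mix directions: negating a numeric column so that it sorts in descending order. All column and variable names here are invented for illustration.

```r
# Hypothetical data with repeated values in the first sort column
fruits <- data.frame(
  kind  = c("citrus", "berry", "citrus", "berry"),
  name  = c("orange", "strawberry", "lemon", "blueberry"),
  count = c(5, 4, 3, 2),
  stringsAsFactors = FALSE
)

# Sort by kind, then alphabetically by name within each kind
by_kind_name <- fruits[order(fruits$kind, fruits$name), ]
by_kind_name$name
# blueberry, strawberry, lemon, orange

# Sort by kind ascending, then by count descending within each kind
# (negating a numeric column reverses its sort direction)
by_kind_count <- fruits[order(fruits$kind, -fruits$count), ]
by_kind_count$name
# strawberry, blueberry, orange, lemon
```

Negation only works for numeric columns; for character columns, the `decreasing` argument of `order()` is the usual starting point.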