# Collections

## In this lecture

- [Introduction](#Introduction)
- [Arrays](#Arrays)
- [Tuples](#Tuples)
- [Dictionaries](#Dictionaries)

## Introduction

Collections are groups of elements.  These elements are values that may be of the same or of different Julia types.  Storing elements in collections is one of the most useful operations in computing.

## Arrays

Arrays are collections of values separated with commas and placed inside of a set of square brackets.  They can be represented in column or in row form.

In [1]:
# A column vector
array1 = [1, 2, 3]

3-element Array{Int64,1}:
 1
 2
 3

The `typeof()` function shows that `array1` is an instance of an array object, containing integer values.

In [2]:
# The type of the object array1
typeof(array1)

Array{Int64,1}

Below we create `array2`.  Note that there are only spaces between the elements.

In [3]:
# A row vector
array2 = [1 2 3]

1×3 Array{Int64,2}:
 1  2  3

The `transpose()` function will create a linear algebra transpose of our column vector, `array1`.

In [4]:
# The transpose
transpose(array1)

1×3 LinearAlgebra.Transpose{Int64,Array{Int64,1}}:
 1  2  3

When the types of the elements are not the same, all elements _inherit_ (are promoted to) the _highest_ type.

In [5]:
# With a mix of types, all the elements inherit the "highest" type
array2 = [1, 2, 3.0]

3-element Array{Float64,1}:
 1.0
 2.0
 3.0

In [6]:
# Index for one of the original integers will be Float64
array2[1]

1.0
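
The promotion rule above can be inspected directly.  The following is a minimal sketch using the Base functions `eltype()` and `promote_type()`; the variable name `mixed` is only illustrative and does not appear elsewhere in this lecture.

```julia
# Mixing Int64 and Float64 literals promotes every element to Float64
mixed = [1, 2, 3.0]
println(eltype(mixed))                  # Float64

# promote_type() reports the common type that two types are promoted to
println(promote_type(Int64, Float64))   # Float64

# When no common numeric type exists, the element type falls back to Any
println(eltype([1, "two", 3.0]))        # Any
```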

Arrays can have more than one _dimension_ (here dimension refers to the number of axes of the array, such as rows and columns, and not to the number of elements in a vector, as it would when describing a vector space).

In [7]:
# Column-wise entry of multidimensional array
array3 = [[1, 2, 3] [4, 5, 6] [7, 8, 9]]

3×3 Array{Int64,2}:
 1  4  7
 2  5  8
 3  6  9

In [8]:
# Row-wise entry of multidimensional array
array4 = [[1 2 3]; [4 5 6]; [7 8 9]]

3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

The `length()` function returns the number of elements.

In [9]:
# Length of array3
length(array3)

9

In [10]:
length(array4)

9

Since the two arrays above were created differently, let's take a look at the indices of their elements.

In [11]:
# Index order of column-wise array
for i in 1:length(array3)
    println("Element $(i) is ", array3[i])
end

Element 1 is 1
Element 2 is 2
Element 3 is 3
Element 4 is 4
Element 5 is 5
Element 6 is 6
Element 7 is 7
Element 8 is 8
Element 9 is 9


In [12]:
# Index order of row-wise array
for i in 1:length(array4)
    println("Element $(i) is ", array4[i])
end

Element 1 is 1
Element 2 is 4
Element 3 is 7
Element 4 is 2
Element 5 is 5
Element 6 is 8
Element 7 is 3
Element 8 is 6
Element 9 is 9
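
The two loops above show that a single (linear) index walks down the columns first.  As a small sketch of this column-major ordering, the snippet below rebuilds the row-wise array and relates linear indices to `[row, column]` indices using the Base functions `vec()` and `CartesianIndices()`.

```julia
# Same values as array4 above, entered row by row
array4 = [1 2 3; 4 5 6; 7 8 9]

# Linear index 2 is the element in row 2, column 1
println(array4[2] == array4[2, 1])       # true

# vec() flattens the array in storage (column-major) order
println(vec(array4))                     # [1, 4, 7, 2, 5, 8, 3, 6, 9]

# CartesianIndices() converts a linear index to a (row, column) index
println(CartesianIndices(array4)[4])     # CartesianIndex(1, 2)
```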


Elements can be repeated using the `repeat()` function.

In [13]:
# Using repeat() to repeat column elements
repeat([1, 2], 3)

6-element Array{Int64,1}:
 1
 2
 1
 2
 1
 2

In [14]:
# Using repeat() to repeat row elements
repeat([1 2], 3)

3×2 Array{Int64,2}:
 1  2
 1  2
 1  2

The `range()` function creates a range object.  The first argument is the value of the first element.  The `step =` argument specifies the step size, and the `length =` argument specifies how many elements the range should have.

In [15]:
# Using range(start, step, number of elements)
range(1, step = 1, length = 10)

1:1:10

We can change the range object into an array using the `collect()` function.

In [16]:
# Create collections using the collect() function
collect(range(1, step = 1, length = 10))

10-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10

In [17]:
# Short-hand syntax
collect(1:10)

10-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
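
A range only stores its start, step, and length; the elements are computed when needed.  The sketch below, with an illustrative variable `r`, uses a step size other than one and shows the equivalent `start:step:stop` short-hand.

```julia
# Five values starting at 1 in steps of 2
r = range(1, step = 2, length = 5)

# A range can be indexed without being collected first
println(r[3])                            # 5

# collect() materializes the range as an array
println(collect(r))                      # [1, 3, 5, 7, 9]

# The short-hand start:step:stop produces the same values
println(collect(1:2:9) == collect(r))    # true
```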

We can create empty arrays as placeholders.

In [18]:
# Creating empty array with two rows and three columns
array5 = Array{Union{Missing, Int}}(missing, 2, 3)

2×3 Array{Union{Missing, Int64},2}:
 missing  missing  missing
 missing  missing  missing

Reshaping is achieved using the `reshape()` function.

In [19]:
# Reshaping
reshape(array5, 3, 2)

3×2 reshape(::Array{Union{Missing, Int64},2}, 3, 2) with eltype Union{Missing, Int64}:
 missing  missing
 missing  missing
 missing  missing
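
It may help to know that `reshape()` does not copy the data: the reshaped array shares its memory with the original.  The following is a minimal sketch with illustrative names `m` and `r`.

```julia
# A vector with six elements
m = collect(1:6)

# View the same data as a 2 x 3 array (filled column by column)
r = reshape(m, 2, 3)
println(size(r))     # (2, 3)
println(r[1, 2])     # 3

# Changing the reshaped array also changes the original vector
r[1, 1] = 100
println(m[1])        # 100
```

If an independent array is required, `copy()` can be applied to the result.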

Every element in an array has an index (address) value.  We already saw this above when we created a for-loop to cycle through the values of our row-wise and column-wise created arrays.

In [20]:
# Creating a 10 x 5 array with each element drawn randomly from value 10 through 20
array6 = rand(10:20, 10, 5)

10×5 Array{Int64,2}:
 20  12  11  13  18
 10  17  10  15  19
 16  13  11  11  13
 10  17  17  11  17
 16  20  17  13  13
 15  11  15  20  20
 20  17  20  19  20
 10  10  19  15  17
 13  14  10  10  20
 15  18  18  10  13

Indexing is indicated with square brackets.  For arrays with rows and columns, the index values will be in the form `[row, column]`.  A colon serves as short-hand syntax indicating _all_ values.

In [21]:
# All rows in first column
array6[:, 1]

10-element Array{Int64,1}:
 20
 10
 16
 10
 16
 15
 20
 10
 13
 15

In [22]:
# Rows two through five of second column
array6[2:5, 2]

4-element Array{Int64,1}:
 17
 13
 17
 20

In [23]:
# Values in rows 2, 4, 6, and in columns 1 and 5
array6[[2, 4, 6], [1, 5]]

3×2 Array{Int64,2}:
 10  19
 10  17
 15  20

In [24]:
# Values in row 1 from column 3 to the last column
array6[1, 3:end]

3-element Array{Int64,1}:
 11
 13
 18

Boolean logic can be used to select values based on rules.  Below we check if each value in column one is greater than $12$.

In [25]:
# Boolean logic (returning only true and false)
array6[:, 1] .> 12

10-element BitArray{1}:
 1
 0
 1
 0
 1
 1
 1
 0
 1
 1
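
The resulting array of Boolean values can itself be used as an index, which selects only the elements for which the rule holds.  The sketch below copies the first column shown above into an illustrative variable `col` so that it can be run on its own.

```julia
# The first column of array6 as displayed above
col = [20, 10, 16, 10, 16, 15, 20, 10, 13, 15]

# Elementwise comparison gives a Boolean mask
mask = col .> 12

# Indexing with the mask keeps only the values greater than 12
println(col[mask])      # [20, 16, 16, 15, 20, 13, 15]

# count() gives the number of true values in the mask
println(count(mask))    # 7
```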

We can add values to an array using the `push!()` function.  Many functions in Julia end in an exclamation mark, called a _bang_.  By convention, it indicates that the function modifies (mutates) its argument in place, so that the change is permanent, rather than returning a changed copy.

In [26]:
# Creating a five element array
array7 = [1, 2, 3, 4, 5]
# Permanently append 10 to end of array
push!(array7, 10)

6-element Array{Int64,1}:
  1
  2
  3
  4
  5
 10

The `pop!()` function removes the last element and returns it (the bang makes the removal permanent).

In [27]:
pop!(array7)

10
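
The difference between a function with and without a bang can be seen by comparing `sort()` and `sort!()`, both part of Base Julia.  This is a small sketch with illustrative variable names `v` and `s`.

```julia
v = [3, 1, 2]

# sort() returns a sorted copy and leaves v untouched
s = sort(v)
println(s)      # [1, 2, 3]
println(v)      # [3, 1, 2]

# sort!() sorts v itself, in place
sort!(v)
println(v)      # [1, 2, 3]
```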

We can also change the value of an element by using its index.

In [28]:
# Change second element value to 1000
array7[2] = 1000

1000

In [29]:
# Viewing the change
array7

5-element Array{Int64,1}:
    1
 1000
    3
    4
    5

_List comprehension_ (called an array comprehension in Julia) is a term that refers to the creation of an array using a _recipe_.  View the following example.

In [30]:
# An example of list comprehension
array8 = [3 * i for i in 1:5]

5-element Array{Int64,1}:
  3
  6
  9
 12
 15

The Julia syntax is very expressive, as the above example shows.  Square brackets indicate that we are creating an array.  The expression `3 * i` indicates what we want each element to look like.  The `for` clause names the placeholder over which we wish to iterate, together with the range that we require.

This allows for very complex array creation, which makes it quite versatile.

In [31]:
# Two-dimensional comprehension: a indexes the rows and b the columns
array9 = [a * b for a in 1:3, b in 1:3]

3×3 Array{Int64,2}:
 1  2  3
 2  4  6
 3  6  9
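
A comprehension can also carry a condition, so that only some of the iterated values contribute an element.  The sketch below uses an `if` clause; the name `evens_squared` is only illustrative.

```julia
# Square only the even numbers from 1 through 10
evens_squared = [i^2 for i in 1:10 if iseven(i)]
println(evens_squared)             # [4, 16, 36, 64, 100]

# Without the square brackets the same recipe is a generator,
# which can be passed directly to a function such as sum()
println(sum(i^2 for i in 1:3))     # 14
```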

Arithmetic operations on arrays are performed through the process of _broadcasting_.  Below we add $1$ to each element in `array8`.

In [32]:
# Elementwise addition of a scalar using dot notation
array8 .+ 1

5-element Array{Int64,1}:
  4
  7
 10
 13
 16

When arrays are of the same shape, we can do elementwise addition.

In [33]:
# Elementwise addition of arrays of the same size
array7 + array8

5-element Array{Int64,1}:
    4
 1006
   12
   16
   20
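
Broadcasting with the dot is not limited to arithmetic operators.  As a brief sketch, the dot can follow any function name, and arrays of compatible but different shapes are expanded to match.  The names `col`, `row`, and `table` are illustrative.

```julia
# Apply a function to every element by adding a dot after its name
println(sqrt.([1, 4, 9]))    # [1.0, 2.0, 3.0]

# A 3-element column vector and a 1 x 3 row vector
col = [1, 2, 3]
row = [10 20 30]

# Broadcasting expands both to 3 x 3 before adding elementwise
table = col .+ row
println(size(table))         # (3, 3)
println(table[2, 3])         # 32, that is col[2] + row[3]
```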

While it is nice to have a complete set of elements, data is often _missing_.  `Missing` is a Julia data type, with the single value `missing`, that provides a placeholder for missing data in a statistical sense.  It propagates automatically through most operations, and equality with `missing` can be tested using `===` or `isequal()`.  Sorting is possible since `isless()` treats `missing` as greater than any other value.

In [34]:
# Propagation
missing + 1

missing

In [35]:
missing > 1

missing

In [36]:
[1, 2, 3, missing, 5] + [10, 20, 30, 40, 50]

5-element Array{Union{Missing, Int64},1}:
 11       
 22       
 33       
   missing
 55       

In [37]:
# Checking equality of value using ==
# Cannot return true or false since value is not known
missing == missing

missing

In [38]:
# Checking identity with === (there is only one missing value, so this is true)
missing === missing

true

In [39]:
# isequal() treats missing as equal to itself
isequal(missing, missing)

true

In [40]:
# Sorting with isless()
isless(1, missing)

true

In [41]:
# Checking on infinity
isless(Inf, missing)

true
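
Because `missing` propagates, a summary such as a sum over data that contains it is itself `missing`.  The sketch below shows some Base functions that are commonly used to work around this: `skipmissing()`, `ismissing()`, and `coalesce()`.  The variable `data` is illustrative.

```julia
data = [1, 2, 3, missing, 5]

# The sum propagates the missing value
println(sum(data))                   # missing

# skipmissing() iterates over the non-missing values only
println(sum(skipmissing(data)))      # 11

# ismissing() tests for the missing value; count() tallies the matches
println(count(ismissing, data))      # 1

# coalesce() replaces missing with a chosen default (here 0)
println(coalesce.(data, 0))          # [1, 2, 3, 0, 5]
```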

We can create an array of zeros.

In [42]:
# A 3 x 3 array of integer zeros
array11 = zeros(Int8, 3, 3)

3×3 Array{Int8,2}:
 0  0  0
 0  0  0
 0  0  0

Here is an array of ones.

In [43]:
# A 3 x 3 array of floating point ones
array12 = ones(Float16, 3, 3)

3×3 Array{Float16,2}:
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0

Boolean values are also allowed.

In [44]:
# Array of true (bit array) values
array13 = trues(3, 3)

3×3 BitArray{2}:
 1  1  1
 1  1  1
 1  1  1

We can even fill an array with a specified value.

In [45]:
# Fill an array with elements of value x
array14 = fill(10, 3, 3)

3×3 Array{Int64,2}:
 10  10  10
 10  10  10
 10  10  10

We have already seen that elements of different types all inherit the _highest_ type.  We can, in fact, change the type manually with the `convert()` function.  As elsewhere in Julia, the dot operator maps the function to each element of an array.

In [46]:
# Convert elements to a different data type
convert.(Float16, array14)

3×3 Array{Float16,2}:
 10.0  10.0  10.0
 10.0  10.0  10.0
 10.0  10.0  10.0
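
Conversion only succeeds when the value can be represented exactly in the target type.  The following sketch shows a conversion that works, one that raises an `InexactError`, and explicit rounding with `round()` as one possible alternative.  The variable `result` is illustrative.

```julia
# 3.0 has an exact integer representation
println(convert(Int64, 3.0))          # 3

# 3.5 does not, so convert() throws an InexactError
result = try
    convert(Int64, 3.5)
catch err
    err
end
println(result isa InexactError)      # true

# Rounding first states explicitly how to handle the fraction
println(round.(Int, [1.4, 2.6]))      # [1, 3]
```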

Arrays can be concatenated.

In [47]:
# Concatenate arrays along rows (makes rows)
array15 = [1, 2, 3]
array16 = [10, 20, 30]
cat(array15, array16, dims = 1)

6-element Array{Int64,1}:
  1
  2
  3
 10
 20
 30

In [48]:
# Same as above
vcat(array15, array16)

6-element Array{Int64,1}:
  1
  2
  3
 10
 20
 30

In [49]:
# Concatenate arrays along columns (makes columns)
cat(array15, array16, dims = 2)

3×2 Array{Int64,2}:
 1  10
 2  20
 3  30

In [50]:
# Same as above
hcat(array15, array16)

3×2 Array{Int64,2}:
 1  10
 2  20
 3  30
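
The bracket syntax used earlier to build arrays performs the same concatenation.  As a short sketch, a semicolon inside square brackets behaves like `vcat()` and a space like `hcat()`; the names `a` and `b` are illustrative.

```julia
a = [1, 2, 3]
b = [10, 20, 30]

# A semicolon concatenates vertically, like vcat()
println([a; b] == vcat(a, b))    # true

# A space concatenates horizontally, like hcat()
println([a b] == hcat(a, b))     # true
println(size([a b]))             # (3, 2)
```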

## Tuples

Tuples are immutable collections.  Immutable refers to the fact that the values are set and cannot be changed.  This type is indicated by the use of parentheses instead of square brackets.

In [51]:
# Tuples with mixed types
tuple1 = (1, 2, 3, 4, "Julia")

(1, 2, 3, 4, "Julia")
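
Immutability can be demonstrated by attempting to assign to an element.  In the sketch below the attempt is wrapped in a `try` block so that the error is caught and reported instead of stopping execution; the variable `outcome` is illustrative.

```julia
tuple1 = (1, 2, 3, 4, "Julia")

# Assigning to an element of a tuple is not defined and raises an error
outcome = try
    tuple1[1] = 10
    "changed"
catch err
    "not allowed: $(typeof(err))"
end
println(outcome)    # not allowed: MethodError

# The tuple is unchanged
println(tuple1[1])  # 1
```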

Let's check on the values and types of each element.

In [52]:
# For loop to look at value and type of each element
for i in 1:length(tuple1)
    println(" The value of the tuple at index number $(i) is $(tuple1[i]) and the type is $(typeof(tuple1[i])).")
end

 The value of the tuple at index number 1 is 1 and the type is Int64.
 The value of the tuple at index number 2 is 2 and the type is Int64.
 The value of the tuple at index number 3 is 3 and the type is Int64.
 The value of the tuple at index number 4 is 4 and the type is Int64.
 The value of the tuple at index number 5 is Julia and the type is String.


Tuples are useful as each element can be assigned to its own named variable.

In [53]:
# Each element is assigned to its own named variable
a, b, c, seven = (1, 3, 5, 7)
a

1

In [54]:
seven

7
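
Julia also provides named tuples, in which the names are stored inside the tuple itself.  This is a minimal sketch; the tuple `point` and its field names `x` and `y` are illustrative.

```julia
# A named tuple with fields x and y
point = (x = 1.5, y = -2.0)

# Elements can be accessed by name or by position
println(point.x)        # 1.5
println(point[2])       # -2.0

# keys() lists the names as symbols
println(keys(point))    # (:x, :y)
```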

A range can be used to reverse the order of a tuple.

In [55]:
# Reverse order index (can be done with arrays too)
tuple1[end:-1:1]

("Julia", 4, 3, 2, 1)

Tuples can be made up of elements of different length.

In [56]:
# Mixed length tuples
tuple2 = ((1, 2, 3), 1, 2, (3, 100, 1))

((1, 2, 3), 1, 2, (3, 100, 1))

In [57]:
# Element 4
tuple2[4]

(3, 100, 1)

In [58]:
# Element 2 in element 4
tuple2[4][2]

100

## Dictionaries

Dictionaries are collections of key-value pairs.

In [59]:
# Example of a dictionary
dictionary1 = Dict(1 => 77, 2 => 66, 3 => 1)

Dict{Int64,Int64} with 3 entries:
  2 => 66
  3 => 1
  1 => 77

In the example above we have the keys `1, 2, 3` and the values `77, 66, 1`.

In [60]:
# The => is shorthand for the Pair() function
dictionary2 = Dict(Pair(1,100), Pair(2,200), Pair(3,300))

Dict{Int64,Int64} with 3 entries:
  2 => 200
  3 => 300
  1 => 100

We can specify the types used in a dictionary.

In [61]:
# Specifying types
dictionary3 = Dict{Any, Any}(1 => 77, 2 => 66, 3 => "three")

Dict{Any,Any} with 3 entries:
  2 => 66
  3 => "three"
  1 => 77

In [62]:
# We can get a bit crazy
dictionary4 = Dict{Any, Any}("a" => 1, (2, 3) => "hello")

Dict{Any,Any} with 2 entries:
  (2, 3) => "hello"
  "a"    => 1

It is perhaps more useful to use symbols (a colon followed by a name) as keys.  We can then refer to the key by name when we want to inquire about its value.

In [63]:
# Using symbols as keys
dictionary5 = Dict(:A => 300, :B => 305, :C => 309)
dictionary5[:A]

300

We can check on the key-value pairs in a dictionary.

In [64]:
# Using in() to check on key-value pairs
in((:A => 300), dictionary5)

true
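
Looking up a key that does not exist raises a `KeyError`.  The sketch below shows two Base functions that avoid this: `haskey()` tests whether a key is present, and `get()` returns a default value when it is not.  The dictionary `d` is an illustrative copy of the one above.

```julia
d = Dict(:A => 300, :B => 305, :C => 309)

# haskey() checks for a key without retrieving its value
println(haskey(d, :A))     # true
println(haskey(d, :Z))     # false

# get() returns the value, or the given default if the key is absent
println(get(d, :A, 0))     # 300
println(get(d, :Z, 0))     # 0
```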

Changing a value using its key is easy to perform.

In [65]:
# Changing an existing value
dictionary5[:C] = 1000
dictionary5

Dict{Symbol,Int64} with 3 entries:
  :A => 300
  :B => 305
  :C => 1000

The `delete!()` function permanently deletes a key-value pair.

In [66]:
# Using the delete!() function
delete!(dictionary5, :A)

Dict{Symbol,Int64} with 2 entries:
  :B => 305
  :C => 1000

We can list both the keys and the values in a dictionary.

In [67]:
# The keys of a dictionary
keys(dictionary5)

Base.KeySet for a Dict{Symbol,Int64} with 2 entries. Keys:
  :B
  :C

In [68]:
values(dictionary5)

Base.ValueIterator for a Dict{Symbol,Int64} with 2 entries. Values:
  305
  1000

Through the use of iteration, we can get creative in the creation and interrogation of a dictionary.

In [69]:
# Creating a dictionary with automatic keys
procedure_vals = ["Appendectomy", "Colectomy", "Cholecystectomy"]
procedure_dict = Dict{AbstractString,AbstractString}()
for (s, n) in enumerate(procedure_vals)
    procedure_dict["x_$(s)"] = n
end

In [70]:
procedure_dict

Dict{AbstractString,AbstractString} with 3 entries:
  "x_1" => "Appendectomy"
  "x_2" => "Colectomy"
  "x_3" => "Cholecystectomy"

In [71]:
# Iterating through a dictionary by key and value
for (k, v) in procedure_dict
    println(k, " is ", v)
end

x_1 is Appendectomy
x_2 is Colectomy
x_3 is Cholecystectomy
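
The same dictionary can be built in a single expression by passing a generator of pairs to `Dict()`.  This is a sketch of that alternative; note that it infers the concrete types `String` for keys and values rather than `AbstractString`, and that the order in which a `Dict` prints or iterates its entries is not guaranteed.

```julia
procedure_vals = ["Appendectomy", "Colectomy", "Cholecystectomy"]

# Build the key-value pairs directly from the enumerated values
procedure_dict = Dict("x_$(i)" => name for (i, name) in enumerate(procedure_vals))

println(length(procedure_dict))    # 3
println(procedure_dict["x_2"])     # Colectomy
```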


Lastly, we can sort using iteration.

In [72]:
# Sorting
dictionary6 = Dict("a" => 1, "b" => 2, "c" => 3, "d" => 4, "e" => 5, "f" => 6)
# Sorting using a for loop
for k in sort(collect(keys(dictionary6)))
    println("$(k) is $(dictionary6[k])")
end

a is 1
b is 2
c is 3
d is 4
e is 5
f is 6
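
Sorting by value instead of by key follows the same pattern.  In the sketch below, `collect()` turns the dictionary into an array of pairs, and the `by = last` keyword tells `sort()` to compare the second item (the value) of each pair.  The dictionary `scores` and the array `ordered` are illustrative.

```julia
scores = Dict("a" => 3, "b" => 1, "c" => 2)

# Collect the pairs and sort them by their values
ordered = sort(collect(scores), by = last)

for (k, v) in ordered
    println("$(k) is $(v)")
end
# b is 1
# c is 2
# a is 3
```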


In [73]:
a = [[1, 2, 3] [4, 5, 6]]
b = [1, 2, 3]

3-element Array{Int64,1}:
 1
 2
 3

In [74]:
transpose(a) * b

2-element Array{Int64,1}:
 14
 32

In [76]:
transpose(a)

2×3 LinearAlgebra.Transpose{Int64,Array{Int64,2}}:
 1  2  3
 4  5  6

In [80]:
size(repeat([1, 2], 3))

(6,)

In [81]:
typeof(("A", 3, "B", 4, "C", 2))

Tuple{String,Int64,String,Int64,String,Int64}