Please read the following information carefully.
This exam consists of 16 multiple choice questions (16 pts), 3 open ended questions (7 pts), and 3 programming questions (9 pts).
The points for the multiple-choice part of the exam are computed as: \[\#correct\_answers - (\#incorrect\_answers / 3)\] This accounts for a 25% guessing correction, corresponding to the four-choice questions we use.
You have 2 hours to complete this exam.
Before you hand in your answers, check that the sheet contains your name and student number, both in the human and computer-readable formats.
The use of the book, notes, calculators or other sources is strictly prohibited.
Note that the order of the letters next to the boxes on your multiple-choice sheet may not always be A-B-C-D! Tip: mark your answers on this exam first, and only after you are certain of your answers, copy them to the multiple-choice answer form.
Read every question properly and in the case of the open questions, give all information requested, which should always include a brief explanation of your answer. Do not however give irrelevant information - this could lead to a deduction of points.
Note that the minimum score per (sub)question is 0 points.
You may write on this exam paper and take it home.
Exam is © 2021 TU Delft.
The command ls -l
lists the file
runTests.py
as follows:
-rwxr-xr-x 1 alice alice 1728 Nov 21 10:32 runTests.py
Which of the following is not correct for the
runTests.py
?
alice
alice
Which of the pipelines lists duplicate line (which are repeated
more than once) in the file file.txt
together with the
number of their occurrences? We assume that a line cannot be repeated by
ten or more times.
Reminder on the options of uniq
command:
-c
prefixes lines by the number of occurrences
-d
only prints duplicate lines, one for each group
sort file.txt | uniq -c | sort -n | tr -s ' ' | grep "^ [1-9]* .*"
sort file.txt | uniq -c | sort -n | tr -s ' ' | grep "^ [2-9]* .*"
sort file.txt | uniq -d | sort -n | tr -s ' ' | grep "^ [1-9]* .*"
sort file.txt | uniq -d | sort -n | tr -s ' ' | grep "^ [2-9]* .*"
What do the following lines evaluate?
val list = List(1,2,3,4)
val unknown = list.foldRight(Nil:List[Int])(_ :: _)
List(4, List(3, List(2, List(1))))
List(1,2,3,4)
List(4,3,2,1)
List(1, List(2, List(3, List(4))))
Monad
interface in Scala:trait Monad[M[_]] {
def unit[S](a: S) : M[S]
def flatMap[S, T] (m: M[S], f: S => M[T]) : Monad[T]
}
trait Monad[M[_]] {
def unit[S](a: S) : M[S]
def map[T] (m: M[S], f: S => M[T]) : Monad[T]
}
trait Monad[M[_]] {
def map[T] (m: M[S], f: S => M[T]) : Monad[T]
def reduce[T,B](init: B, f: (B,T) => M[B]): Monad[B]
}
trait Monad[M[_]] {
def map[T] (m: M[S], f: S => M[T]) : Monad[T]
def flatMap[S, T] (m: M[S], f: S => M[T]) : T
}
reduce
fold
aggregate
mapPartitions
mapValues
is an example of a function that
requires shufflingIn a system with 3 nodes A, B, C, six events took place in the following order:
msg1
to Bmsg2
to Amsg1
msg2
msg3
to Cmsg3
Which pair of events are not causally related (i.e., they are concurrent) to each other? (You can compute vector clocks for checking causal dependencies.)
We’re designing a system that replicates a sequence of user commands on a geographically distributed set of nodes. Our system ensures that all previous commands are replicated to all the nodes before processing a new one.
Given that our system runs on an unreliable network, which of the properties cannot always be satisfied by our system?
What will be produced by the following Flink code snippet:
.map(c => (c.id, 1)).keyBy(x => x._1).sum(1) dataStream
x
whose initial value is 0
. The operations in the history are
labelled with A-G. Is the given execution history linearizable? If yes,
provide a valid linearization. If not, provide an example operation that
violates linearizability.
Answer:
It is not linearizable.
The operation E cannot read x=1 after the operation G which takes effect before E and reads x=2.
It would be linearizable if G reads x=1 or E reads x=2
(3pts) Given the following Spark program, write
the types of the variables a
,
b
, and c
.
case class Student(sid: Int, sname: String)
case class Enrollment(sid: Int, cid: Int)
val students: RDD[Student] = // reads students from a file into an RDD
val enrollments: RDD[Enrollment] = // reads enrollments from a file into an RDD
val a = students.map(s => (s.sid, s.sname))
val b = enrollments.groupBy(_.sid).mapValues(_.size)
val c = a.join(b)
Answer:
val a: RDD[(Int, String)]
val b: RDD[(Int, Int)]
val c: RDD[(Int, (String, Int))]
Answer:
Wide transformations are transformations that involve a shuffle of the data between the partitions.
Example: `groupByKey()`, `join()`
Narrow transformations are transformations for which each input partition will contribute to only one output partition and they do not require shuffle between partitions.
Example: `map()`, `filter()`
See the FP practice questions in Weblab.