The de facto languages of Big Data and data science are
Other languages include
In our course, we will be using Scala and Python.
Scala
object Hello extends App {
  println("Hello, world")
  for (i <- 1 to 10) {
    System.out.println("Hello")
  }
}
In Scala, blocks are delimited by { and }.
Python
#!/usr/bin/env python3
for i in range(1, 10):
    print("Hello, world")
Scala
val a: Int = 5
val b = 5
b = 6 // re-assignment to val
// Type of foo is inferred
val foo = new ImportantClass(...)
var a = "Foo"
a = "Bar"
a = 4 // type mismatch
vals are single-assignment, vars are multiple-assignment.
Python
a: int = 5
a = "Foo"
a = ImportantClass(...)
Scala
def max(x: Int, y: Int): Int =
if (x >= y) x else y
Python
def max(x : int, y : int) -> int:
    if x >= y:
        return x
    else:
        return y
Scala
def bigger(x: Int, y: Int,
           f: (Int,Int) => Boolean) =
  f(x, y)
bigger (1, 2, (x, y) => (x < y))
bigger (1, 2, (x, y) => (x > y))
// Compile error
bigger (1, 2, x => x)
Python
def bigger(x, y, f):
    return f(x, y)
bigger(1,2, lambda x,y: x > y)
bigger(1,2, lambda x,y: x < y)
# Runtime error
bigger(1,2, lambda x: x)
bigger is a higher-order function, i.e. a function whose behaviour is parametrised by another function; f is the function parameter. To call a higher-order function, we need to pass a function with the appropriate argument types. In Scala, the compiler checks this; in Python, a mismatch only surfaces at runtime.
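Because Python does not check the function argument ahead of time, the failure only appears when bigger actually calls f. The following is a minimal sketch of that behaviour; the typing.Callable hints and the variable names less_than, greater_than and failed are additions for illustration, and the hints are not enforced by the interpreter.

```python
from typing import Callable


def bigger(x: int, y: int, f: Callable[[int, int], bool]) -> bool:
    # f is the function parameter; bigger simply delegates to it
    return f(x, y)


less_than = bigger(1, 2, lambda x, y: x < y)
greater_than = bigger(1, 2, lambda x, y: x > y)

# A one-argument lambda is accepted when passed in and only fails
# once bigger calls it with two arguments.
try:
    bigger(1, 2, lambda x: x)
    failed = False
except TypeError:
    failed = True
```

A static checker such as mypy could flag the last call before the program runs, but that is an optional extra step rather than part of the language.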
Scala
class Foo(val x: Int,
var y: Double = 0.0)
// Type of a is inferred
val a = new Foo(1, 4.0)
println(a.x) //x is read-only
println(a.y) //y is read-write
a.y = 10.0
println(a.y) //y is read-write
a.y = "Foo" // Type mismatch, y is double
val means a read-only attribute; var means a read-write one.
Python
class Foo():
    def __init__(self, x, y):
        self.x = x
        self.y = y
a = Foo(3,2)
print(a.x)
a.x = "foo"
print(a.x)
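Python has no direct counterpart to val and var: attributes are read-write and untyped by default. One common way to approximate a read-only attribute is a property without a setter. The sketch below assumes that approach; the private name _x and the flag x_is_read_only are illustrative choices.

```python
class Foo:
    def __init__(self, x: int, y: float = 0.0):
        self._x = x   # leading underscore: private by convention only
        self.y = y    # plain attribute, read-write like a Scala var

    @property
    def x(self) -> int:
        # No setter is defined, so assigning to x raises AttributeError
        return self._x


a = Foo(1, 4.0)
a.y = 10.0            # allowed

try:
    a.x = 5           # rejected, but only at runtime
    x_is_read_only = False
except AttributeError:
    x_is_read_only = True
```

Unlike in Scala, assigning a string to a.y would still succeed here, since type hints are not enforced at runtime.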
Scala
class Foo(val x: Int,
var y: Double = 0.0)
class Bar(x: Int, y: Int, z: Int)
extends Foo(x, y)
trait Printable {
val s: String
def asString() : String
}
class Baz(x: Int, y: Double, private val z: Int)
extends Foo(x, y) with Printable {
override val s: String = "Baz" // illustrative value
override def asString(): String = ???
}
Python
class Foo():
    def __init__(self, x, y):
        self.x = x
        self.y = y
class Bar(Foo):
    def __init__(self, x, y, z):
        Foo.__init__(self, x, y)
        self.z = z
In both cases, the traditional rules of method overriding apply. Traits in Scala are similar to interfaces with default methods in Java 8 and later; in addition, they can include attributes (state).
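To make the overriding rules concrete on the Python side, here is a small sketch. The method as_string is a hypothetical name chosen to mirror asString in the Scala trait, and super() is used as the more common alternative to naming the parent class explicitly.

```python
class Foo:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def as_string(self) -> str:
        return f"Foo({self.x}, {self.y})"


class Bar(Foo):
    def __init__(self, x, y, z):
        super().__init__(x, y)   # initialise the inherited attributes
        self.z = z

    def as_string(self) -> str:
        # Overrides Foo.as_string and reuses the parent implementation
        return super().as_string() + f" with z={self.z}"


items = [Foo(1, 2), Bar(1, 2, 3)]
# The method that runs depends on the object's actual class
rendered = [item.as_string() for item in items]
```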
Scala
case class Address(street: String,
                   number: Int)
case class Person(name: String,
                  address: Address)
val p = Person("G", Address("a", 2))
Python >= 3.7
from dataclasses import dataclass
@dataclass
class Address:
    street: str
    number: int
@dataclass
class Person:
    name: str
    addr: Address
p = Person("G", Address("a", 2))
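Data classes are blueprints for immutable objects. We use them to represent data records. Scala case classes are immutable by default; Python data classes are mutable unless declared with @dataclass(frozen=True). Both languages implement equals (or __eq__) for them, so we can compare objects directly.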
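A short sketch of what comparing objects directly looks like in Python. It reuses the Address and Person classes from above, with frozen=True added so that instances cannot be modified after construction; the variable names are illustrative.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)   # frozen=True rejects attribute assignment
class Address:
    street: str
    number: int


@dataclass(frozen=True)
class Person:
    name: str
    addr: Address


p1 = Person("G", Address("a", 2))
p2 = Person("G", Address("a", 2))

same_value = p1 == p2     # field-by-field comparison via the generated __eq__
same_object = p1 is p2    # identity: these are two distinct objects

try:
    p1.name = "H"
    mutated = True
except FrozenInstanceError:
    mutated = False
```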
Pattern matching is if..else on steroids.
// Code for demo only, won't compile
value match {
  // Match on a value, like if
  case 1 => "One"
  // Match on the contents of a list
  case x :: xs => "The remaining contents are " + xs
  // Match on a case class, extract values
  case Email(addr, title, _) => s"New email: $title..."
  // Match on the type
  case xs : List[_] => "This is a list"
  // With a pattern guard
  case xs : List[Int] if xs.head == 5 => "This is a list of integers"
  case _ => "This is the default case"
}
This is by no means a complete introduction to either programming language. Please read more here.
This work is (c) 2017, 2018, 2019, 2020, 2021 - onwards by TU Delft and Georgios Gousios and licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.