In this course, we will learn:
- Programming techniques, specifically functional programming, useful for Big Data Systems.
- Basic distributed systems theory, applied to databases and file systems
- “Small” data processing
- Batch processing
- Stream processing
- Graph processing
Schedule
You can get information about the course schedule here
Grades
- 60% final exam, multiple-choice and some open-ended questions.
- 40% assignments. 5 assignments, 8% each.
There will be a resit; there is no mid-term.
You can transfer your assignment grade to the resit AS A WHOLE. No individual assignment resubmissions!
Assignments
The course consists of 5 mandatory assignments.
You always work in pairs: select your teammate!
Grade: \(\frac{\sum_{assign=1}^{5} grade(assign)}{5}\), aka the mean
- No submission? Grade \(\rightarrow\) 0
- Late submission? Grade \(\rightarrow\) 0
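As a rough illustration of the grading arithmetic above, here is a minimal sketch in Python. The function names, the use of `None` for a missing or late submission, and the example grades are assumptions made for illustration only; the sketch simply combines the 60%/40% split with the mean over the 5 assignments and is not the official grading script.

```python
from typing import Optional, Sequence

NUM_ASSIGNMENTS = 5  # the course has 5 mandatory assignments


def assignment_grade(grades: Sequence[Optional[float]]) -> float:
    """Mean over all 5 assignments; None (no or late submission) counts as 0."""
    if len(grades) != NUM_ASSIGNMENTS:
        raise ValueError(f"expected {NUM_ASSIGNMENTS} grades, got {len(grades)}")
    return sum(g if g is not None else 0.0 for g in grades) / NUM_ASSIGNMENTS


def course_grade(exam: float, grades: Sequence[Optional[float]]) -> float:
    """60% final exam plus 40% assignment mean (hypothetical helper)."""
    return 0.6 * exam + 0.4 * assignment_grade(grades)


# Example with made-up grades: the third assignment was not submitted.
example = [8.0, 9.0, None, 7.0, 10.0]
print(round(assignment_grade(example), 2))
print(round(course_grade(7.0, example), 2))
```

With these made-up numbers the assignment mean is 34 / 5 = 6.8, so the course grade works out to 0.6 · 7.0 + 0.4 · 6.8 = 6.92. Note how a single missing submission lowers the mean noticeably, since each assignment carries 8% of the total.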
We will use Unix for the first assignment, IntelliJ/Scala for all others.
Template files are provided on Brightspace
- Use Maven for assignments 2, 3, 4 and 5 to install all dependencies
Assignment submission
We will use CPM. The course name is CSE2520: Big Data Processing
- You need to sign up to enroll and also declare your pair
- To submit, hit the overview button and select the appropriate assignment
- Pay attention to the instructions about submitting an assignment. Deviating from these may cause automatic grading to fail
- All the assignments have deadlines
- Feedback and grading are automatic: the results are available on CPM
- Technical support: ask the TAs during the labs or via Mattermost
- Feedback on CPM is not instantaneous. Do let the TAs know if no feedback has appeared after an hour
Need help?
- Lab sessions every Thursday afternoon
- All TAs will be around
- Work together to solve assignments
- Ask questions about the next assignment
- Ask questions about your grades
- You are welcome to join the BDP 2021-2022 Mattermost channel.
Rules for using Mattermost
- Fire and forget
- Don’t expect TAs to answer
- Don’t expect the course instructors to answer questions
- TAs and instructors do not work outside business hours
- Mattermost should be mostly used to exchange opinions and ideas
- No abusive behavior (swearing, stalking, threatening, sexism, etc.) will be tolerated
Slide symbols
Q A question with a known answer; this will be revealed, but we should work together towards it!
D An open discussion item; we need to think and discuss.
Lecture notes
Freely available on the web
You can print/download them before the lecture and bring them along to make additional notes.
We are looking forward to improving them! If you have suggestions or find errors, let me know!