# ICS 691: Advanced Data Structures (Spring 2016)

Instructor: Nodari Sitchinava
Email: nodari (at) hawaii (dot) edu
Location: Holmes Hall 242
Time: Monday, Wednesday 9:00-10:15am
Office Hours: Monday 10:30-11:30 in POST 309C

## Course Description

The field of data structures studies how to preprocess and organize large data sets that mostly do not change over time so that queries on the data can be answered efficiently. You encounter these problems more often than you might think: interactions with Google search (return a document, among many, that contains a query text), finding directions (return the shortest path between two points on a large map that mostly does not change), querying employee databases (return all employees that satisfy certain criteria, e.g. fall within some salary and age ranges), etc.
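As a rough illustration of the "preprocess once, query many times" idea, here is a minimal sketch of the employee example, simplified to a single criterion (salary). The class name `SalaryIndex` and the `(name, salary)` tuple layout are assumptions made for this example, not part of the course material.

```python
from bisect import bisect_left, bisect_right


class SalaryIndex:
    """Hypothetical static index: sort once, then answer range queries quickly."""

    def __init__(self, employees):
        # employees: list of (name, salary) pairs (an assumed layout).
        # Preprocessing: O(n log n) time to sort by salary.
        self._by_salary = sorted(employees, key=lambda e: e[1])
        self._salaries = [salary for _, salary in self._by_salary]

    def in_range(self, lo, hi):
        # Query: two binary searches, O(log n), plus the size of the output.
        i = bisect_left(self._salaries, lo)
        j = bisect_right(self._salaries, hi)
        return [name for name, _ in self._by_salary[i:j]]


index = SalaryIndex([("Ann", 50), ("Bob", 70), ("Cy", 90)])
print(index.in_range(60, 95))
```

Handling two criteria at once (salary and age) needs a genuinely two-dimensional structure, which this one-dimensional sketch does not attempt.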

There have been many recent results in the area of data structures, with many more problems remaining open. The course will provide an overview of existing data structuring results, develop an in-depth understanding of some specific results, teach specific data structuring techniques, and give students an opportunity to tackle some of the open problems.

To get a flavor of the types of problems studied in this course, consider the following questions:

• Did you know that the best known algorithm for sorting arbitrary integers runs in O(n √(log log n)) expected time?!
• You probably learned in the undergraduate algorithms course that you can use a priority queue to sort data, but did you know that any sorting algorithm that runs in O(n * f(n)) time can be converted into a priority queue data structure with O(f(n)) time per operation?
• You also probably learned in an undergraduate algorithms class that Dijkstra's algorithm is the fastest way to find the shortest path between two nodes in a graph. But did you know that if Google simply implemented Dijkstra's algorithm in Google Maps, each time you asked Google to find the shortest route between two points on the map of the US, it would take minutes, and not milliseconds (as it does now), to return the answer?
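For reference, the familiar direction mentioned above (using a priority queue to sort) can be sketched as follows. This is a minimal illustration using the binary heap in Python's standard `heapq` module; the function name `pq_sort` is made up for this example. It does not show the converse reduction from sorting to priority queues.

```python
import heapq


def pq_sort(items):
    """Sort by inserting every item into a priority queue, then extracting minima."""
    heap = []
    for x in items:
        heapq.heappush(heap, x)  # O(log n) per insertion with a binary heap
    # n extract-min operations, each O(log n), return the items in sorted order.
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(pq_sort([5, 3, 8, 1]))
```

With a binary heap each operation costs O(log n), so the sort runs in O(n log n) time overall; a priority queue with faster operations would give a correspondingly faster sort.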

## Registering for the course

I am traveling during the first week of classes, and the following Monday is a holiday. As a result, the first "real" lecture will not take place until after the add date, so you will not be able to attend a couple of lectures before deciding whether to register. Therefore, if you are wondering whether this course is for you, I suggest you email me as soon as possible.

During the first week (in lieu of the lectures), there will be a take-home assessment exam, which is posted on this webpage. I will use the results of the exam to determine students' level of preparation for the prerequisites and to adjust the course and topics appropriately. Therefore, to give an accurate picture of students' knowledge, the exam should be an individual effort.

## Prerequisites

• ICS 311 (the undergraduate algorithms course) or permission of the instructor.

This is a course on advanced algorithmic concepts, so you should be very comfortable with asymptotic notation, design and analysis of basic algorithms, and the algorithms taught in ICS 311 (most of the material from the CLRS textbook). Therefore, unless you received an 'A' in ICS 311 or equivalent, I suggest you talk to me before registering for the course.

## Textbook

There is no textbook for the material in this course because the material consists mostly of recent research results. The articles covering the material will be posted here.

## Grading

The grade in the course will consist of the following components:

• Attendance and Scribing (20%): You must attend lectures. You will be asked to take notes during class and typeset lecture notes in LaTeX for several classes (the exact number will depend on the number of registered students).
• Homework (40%): There will be 3-5 homeworks throughout the semester.
• Project (40%): You will be expected to complete a research project (see below for a description) and give a 30-minute presentation about your project at the end.

## Scribing

Content. The notes you write should cover all the material covered during the relevant lecture, plus real references to the papers containing the covered material. Your notes should be understandable to someone who has not been to the lecture. You should write in full sentences where appropriate; point form (like I write on the board) is often too terse to follow without a sound track (though occasionally it is appropriate). Use numbered sections, subsections, etc. to organize the material hierarchically and with meaningful titles. If you feel it is appropriate, use nested bullets to organize material hierarchically even deeper. Try to preserve the motivation, difficulties, solution ideas, failed attempts, and partial results obtained along the way in the actual lecture.

Format. Write your notes using LaTeX, with figures in Encapsulated PostScript (generated from xfig, ipe, Adobe Illustrator or whatever you want). Start from the LaTeX template, which sets the style.

Timing. Try to write the lecture notes for a class on the same day, while the material is fresh in your mind; it will save you time. You should finish the first draft of your notes and send it to me within two days of the lecture. I will then either send you comments via email or schedule a meeting with you to go over your write-up and make suggestions. You will then make a second pass and send it to me. I'll make the final pass and post it on the webpage. The goal is to get the notes out within one week of the corresponding class.

## Homework

There will be 3-5 homeworks (one every 2-3 weeks) throughout the semester. Each homework will contain as many problems as there are students in class. You may collaborate with anyone on the homeworks, but you must write up your own solutions. You must write the names of everyone you discussed the solution with in your homework write-up. The homework must be typed up. It will be due at the beginning of the lecture on the day it is due and can be either submitted in person in lecture (preferred) or emailed to me. Homework solutions will be discussed in class on that day; therefore, no late homeworks will be accepted (even if you miss that lecture).

## Project

Goal. The ideal outcome of the project is for you to obtain, by the end of the class, results that can be published at an algorithms conference. You do not have to achieve this to receive full credit on the project (that's the nature of research), but it should be what you aim for. If you do not achieve publishable results, your write-up should describe the ideas and approaches you took to solve the problem.

Topic. The topic of your research project should be related to data structures. I will be available during office hours to brainstorm possible topics of interest. You must be interested in the topic, but I must approve the topic, so check with me first.

Format. Here is a list of possible formats of the project. This list is not exhaustive, so if you have an interesting idea that you don't see on the list below, come discuss it with me.

• Learn about an open problem and try to solve it (and hopefully succeed).
• Implement one or more related data structures and experiment with them. Examples of outcomes: understanding the behavior of a data structure in practice; comparing two or more related data structures in practice; determining whether a previously unimplemented data structure is practical; finding a way to optimize the implementation of a data structure to be practical, e.g. by optimizing the use of the cache hierarchy or through the use of special instructions of the specific architecture.
• Invent a good open problem. Here, the main contribution is posing the problem itself. But you should think about how it could be solved as well.
• Write up a set of related topics in more detail. This should be more extensive than the write-up for lecture notes. The write-up should read like a book chapter, i.e., it should provide a more extensive introduction, motivation and literature review of related results for the problems discussed, and present all details, proofs and a full set of bibliographical references. Ideally, it should conclude with several exercises on the topic and/or a list of open research problems. If you decide to take on this project, talk to me first to agree on the set of topics that should be covered by your write-up. This type of project might require more than the 15-page limit.

Write-up. The project must be written up in a research paper format. It should be between 6 and 15 single-spaced pages with 1-inch margins. It should start with a title, author and a 1-2 paragraph abstract. The rest of the write-up should consist of an introduction, a body and conclusions. The introduction should describe the problem you are addressing, present a brief literature review of related results on the topic, and summarize your results. The body should describe your solution, your technique/approach to solving the problem, and your results. If you haven't achieved significant results, you should still describe the techniques/approaches you have tried and why they didn't work. The conclusions should summarize what you have presented and present possible directions for future research, e.g. open problems that remain unsolved and/or possible approaches that you might have tried if you had more time. You are welcome to collaborate on the project with anyone (even outside the class), including me, but you should give credit to people you have collaborated with. This is the nature of research.

Presentation. At the end of the semester you should give a 30-minute presentation about your project.

## Topics

The class will cover a subset of the following topics:

• Persistence (partial, full, retroactive)
• Search indexing and search trees
• Dictionaries
• Text indexing
• Geometric data structures
• Integer data structures
• Static data structures
• Data structures for memory hierarchies (cache-efficient data structures)
• Succinct data structures (when space is limited)
• Lower bounds for data structures
• Dynamic graphs and distance oracles
• Kinetic data structures
• Shortest paths

The specific topics covered will depend on the students' interests. The schedule below will be updated with the topics as they are covered.