Given a sequence of nodes and their neighboring similarities, can you segment the nodes automatically?
This is a basic task in data science and is fundamental to many downstream tasks such as DNA analysis, movie scene analysis, music retrieval and text summarization. Numerous methods have been proposed to attack the problem. However, most of them require users to specify the number of segments, which is usually unknown in practice. In this project we will try to overcome this limitation by using a clusterability criterion based on information divergence, which lets us choose the number of segments automatically. Afterwards, we can recursively subdivide each segment as long as the clusterability criterion calls for it. In the end we will obtain a segmentation hierarchy, which can greatly facilitate the understanding and analysis of sequences.
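To make the recursive idea concrete, here is a minimal Python sketch under stated assumptions. It is not the project's method: the information-divergence clusterability criterion is not specified here, so the sketch swaps in a simple hypothetical rule (split a segment at its weakest link when that link is well below the segment's average similarity). The names `split_segment`, `leaves`, `ratio` and `min_size` are illustrative only, and each step makes a binary split rather than choosing several segments at once.

```python
from statistics import mean


def split_segment(sims, lo, hi, ratio=0.5, min_size=2):
    """Build a hierarchy over nodes lo..hi-1.

    sims[i] is the similarity between node i and node i+1.
    The split rule below is a hypothetical stand-in, not the
    information-divergence criterion of the project.
    """
    node = {"range": (lo, hi), "children": []}
    # Too small to yield two segments of at least min_size nodes each.
    if hi - lo < 2 * min_size:
        return node
    links = range(lo, hi - 1)  # links that lie inside this segment
    # A cut after node i leaves i+1-lo nodes on the left, hi-(i+1) on the right.
    candidates = [i for i in links
                  if i + 1 - lo >= min_size and hi - (i + 1) >= min_size]
    cut = min(candidates, key=lambda i: sims[i])  # weakest admissible link
    average = mean(sims[i] for i in links)
    # Stand-in "clusterability" test: the weakest link is far below average.
    if sims[cut] < ratio * average:
        node["children"] = [
            split_segment(sims, lo, cut + 1, ratio, min_size),
            split_segment(sims, cut + 1, hi, ratio, min_size),
        ]
    return node


def leaves(node):
    """Return the finest-level segments as (start, end) ranges, end exclusive."""
    if not node["children"]:
        return [node["range"]]
    return [r for child in node["children"] for r in leaves(child)]


# Seven nodes, six neighboring similarities, one clearly weak link.
sims = [0.9, 0.8, 0.1, 0.9, 0.85, 0.9]
tree = split_segment(sims, 0, len(sims) + 1)
print(leaves(tree))
```

The nested dictionary returned by `split_segment` plays the role of the segmentation hierarchy: the root covers the whole sequence, and each level of children is a finer segmentation. No number of segments is passed in; the recursion stops by itself when the stand-in criterion no longer finds a weak enough link.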
This is a basic research project. Students taking it on should welcome the challenge of making a breakthrough in the fundamentals of machine learning. Good programming skills and university-level mathematics are required to enjoy the project.