Collective Programming - Classifying Passages from Chinese Websites

Abstract

This project is adapted from the book Programming Collective Intelligence: Building Smart Web 2.0 Applications, whose example code and resources are no longer available. I therefore designed an API of my own and implemented the clustering algorithms in an entirely Chinese-language environment. The example API is built on web crawling of ZhiHu; the source code can be viewed here. The core algorithms are hierarchical clustering and K-means clustering. Finally, I built a simple application on Douban, a Chinese movie website (similar to IMDb), that recommends books to users with similar preferences.
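As a rough illustration of the K-means step mentioned above, here is a minimal sketch in plain Python. It is not the project's actual code: the function names `euclidean` and `kmeans`, the `seed` parameter, and the choice of Euclidean distance are all assumptions made for this example (the book itself works with word-count vectors and a Pearson-based distance).

```python
import random
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(rows, k, iterations=100, seed=0):
    """Cluster `rows` into `k` groups; returns (assignments, centroids).

    Hypothetical helper for illustration: initial centroids are k rows
    picked at random, and `seed` makes the run reproducible.
    """
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach each row to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: euclidean(row, centroids[c]))
            for row in rows
        ]
        if new_assignments == assignments:
            break  # No row changed cluster, so the result is stable.
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [rows[i] for i, a in enumerate(assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Toy data: two well-separated groups of 2-D points.
points = [[1, 0], [1.1, 0.1], [0.9, 0], [10, 10], [10.1, 9.9], [9.9, 10.2]]
labels, centers = kmeans(points, k=2)
```

In a text-clustering setting each row would instead be a word-count vector for one passage, which for Chinese text first requires word segmentation; that step is outside this sketch.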

Date
Jul 1, 2017 12:00 AM
Dengpan Yuan
Columbia@MSCS

I’m an MSCS student at Columbia University@SEAS.