Printing Clusters (Top terms & titles)

I've followed all the steps down to the final one, where you print the top terms per cluster together with the film titles.
I'm using a slightly different dataset (blog titles and blog post content), but in essence my data is the same as yours. The difference is that my data is already in a dataframe, so where you use 'synopses', I use df.Content. The one step I couldn't do was the one where you grouped the rank by clusters, as that doesn't apply to my data. I want ten clusters from my data.

Here, you create a dictionary:

```python
films = { 'title': titles, 'rank': ranks, 'synopsis': synopses, 'cluster': clusters, 'genre': genres }
frame = pd.DataFrame(films, index = [clusters] , columns = ['rank', 'title', 'cluster', 'genre'])
```

But as I already have a dataframe, I reindexed it using clusters. The problem is that only the first ten blog post titles are being used, as this screenshot shows:

![image](https://cloud.githubusercontent.com/assets/14995505/21614948/799b7768-d1d3-11e6-921f-0804c5bb7d61.png)

As this is my first attempt at kMeans (although I've been experimenting with my data for three weeks), I'm not yet clever enough to work out what's going wrong. Any ideas? Thanks in advance!
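
One possible explanation, sketched under assumptions: if the reindexing was done with something like `df.reindex(clusters)`, pandas looks rows up by label rather than relabelling them, so with cluster ids 0 to 9 only the rows labelled 0 to 9 (the first ten posts) can ever be returned. The toy example below uses a hypothetical six-row dataframe with a `Title` column (an assumed name; `Content` is the only column named above) and three made-up cluster labels to show the effect, then an alternative that stores the labels in a column and groups on it. It is not necessarily what happened in my notebook, just a small reproduction of the pattern.

```python
import pandas as pd

# Hypothetical stand-in for the blog dataframe: six posts, default 0..5 index.
df = pd.DataFrame({
    "Title": [f"post {i}" for i in range(6)],
    "Content": ["..."] * 6,
})

# Made-up cluster labels, one per row, in the shape km.labels_.tolist() would give.
clusters = [2, 0, 1, 0, 2, 1]

# reindex() selects rows BY LABEL: label 2 -> row 2, label 0 -> row 0, and so on.
# Only rows whose index equals a cluster id can appear, repeated as needed.
reindexed = df.reindex(clusters)
titles_seen = sorted(reindexed["Title"].unique())

# Alternative: keep every row and attach the labels as a new column.
frame = df.copy()
frame["cluster"] = clusters

# Collect the titles that belong to each cluster.
titles_by_cluster = {
    cluster_id: group["Title"].tolist()
    for cluster_id, group in frame.groupby("cluster")
}

for cluster_id, cluster_titles in titles_by_cluster.items():
    print(f"Cluster {cluster_id} titles: {', '.join(cluster_titles)}")
```

With three clusters, `titles_seen` contains only the first three posts, which mirrors the "first ten titles" symptom for ten clusters, whereas `titles_by_cluster` covers all six rows.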


Printing Clusters (Top terms & titles) #12
