I used Python to crawl hip-hop songs on NetEase Cloud Music and analyze how rappers rhyme

Source: Internet
Author: User

Origin

The variety show "China Has Hip-Hop" (The Rap of China) drew a huge audience this summer and brought hip-hop into the mainstream. It is the only variety show I watched this year, and it made a big impression on me. I spent most of the summer in Hangzhou, and I binged the show in taxis on my commute, so by the end I had watched it several times over. Along the way I picked up terms such as flow, punchline, diss, and the flashy double rhyme and triple rhyme.

Of these, rhyming interests me the most. I thought: what if I collected every rhyming word that hip-hop artists sing, so that given any keyword I could pull up rhymes for it and freestyle on the spot? Wouldn't that be cool ~

General approach

The overall idea is straightforward:

Songs (lyrics) -> word lexicon -> pinyin -> rhymes -> freestyle anytime, anywhere

Crawling data from the NetEase Cloud Music website

First, open the NetEase Cloud Music playlist page and select the "Rap" category:

This page lists the rap playlists that users have curated. Crawling every page of the listing gives the full set of rap playlists, 1,275 in total:

Opening any one of these playlists shows the page with its songs:

Crawling the songs on every playlist page gives the song list, more than 10,000 songs in total.

With the song ID of each song, call the lyrics API (http://music.163.com/api/song/lyric?os=pc&id=509135896&lv=-1&kv=-1&tv=-1) to get its lyrics:
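As a rough illustration of the lyrics call, here is a minimal sketch that builds the URL quoted above for a given song ID and strips the time tags from the returned lyric text. The helper names `build_lyric_url` and `extract_lyric_lines` are hypothetical, and the response shape (JSON with the lyric text under `lrc` -> `lyric`, in LRC format) is an assumption of mine rather than something stated in this article.

```python
import json
import re
from urllib.parse import urlencode

LYRIC_API = "http://music.163.com/api/song/lyric"

def build_lyric_url(song_id):
    """Build the lyrics URL for one song, mirroring the query string quoted above."""
    params = {"os": "pc", "id": song_id, "lv": -1, "kv": -1, "tv": -1}
    return LYRIC_API + "?" + urlencode(params)

# LRC time tags look like [00:12.34] or [01:02.345].
_TIME_TAG = re.compile(r"\[\d{1,2}:\d{2}(?:[.:]\d{1,3})?\]")

def extract_lyric_lines(response_text):
    """Return the plain lyric lines from a JSON response body.

    Assumption: the lyric text sits under payload["lrc"]["lyric"] in LRC format.
    """
    payload = json.loads(response_text)
    raw = (payload.get("lrc") or {}).get("lyric") or ""
    lines = (_TIME_TAG.sub("", line).strip() for line in raw.splitlines())
    return [line for line in lines if line]
```

The response body can come from any HTTP client; `extract_lyric_lines` only needs the text, so the parsing can be checked offline against a saved response.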

Of course, NetEase Cloud Music has some anti-crawling measures. When I first started testing, my real IP was blacklisted and requests returned a flood of 503 responses:

So I found a pool of proxy IPs and rate-limited the crawler, which got around the anti-crawling mechanism.
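The crawler itself is not shown here, so the following is only a sketch of one way to combine the two measures: rotate through a list of proxies and enforce a minimum delay between requests. The class name `ThrottledFetcher`, the one-second default interval, and the proxy addresses in the usage note are all placeholders I chose for illustration.

```python
import itertools
import time
import urllib.request

class ThrottledFetcher:
    """Fetch URLs through rotating proxies with a minimum delay between requests."""

    def __init__(self, proxies, min_interval=1.0,
                 sleep=time.sleep, clock=time.monotonic):
        self._proxies = itertools.cycle(proxies)  # round-robin over the pool
        self._min_interval = min_interval
        self._sleep = sleep    # injectable so the timing logic can be tested
        self._clock = clock
        self._last_request = None

    def _wait_turn(self):
        """Sleep just long enough to keep min_interval between requests."""
        if self._last_request is not None:
            remaining = self._min_interval - (self._clock() - self._last_request)
            if remaining > 0:
                self._sleep(remaining)
        self._last_request = self._clock()

    def next_proxy(self):
        return next(self._proxies)

    def fetch(self, url, timeout=10):
        """Download one URL as text through the next proxy in the pool."""
        self._wait_turn()
        proxy = self.next_proxy()
        handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
        opener = urllib.request.build_opener(handler)
        with opener.open(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
```

Usage would look like `ThrottledFetcher(["http://10.0.0.1:8080", "http://10.0.0.2:8080"]).fetch(url)`. A real crawler would probably also retry on 503 and drop proxies that stop responding.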

Crawl process GIF:

After a few hours or so, the lyrics of all 10,000+ songs were in hand!

Data processing (word segmentation, pinyin conversion, finding double and triple rhymes)

Next, I segmented the lyrics of each song into words, using the open-source jieba segmenter in parallel mode. For example:

As you can see, the segmentation of this sentence is quite satisfactory.

After segmentation, the words jieba produces can also be counted:

This way, a whole document can be segmented and its words counted. I ran every song through segmentation and pushed the results into a Redis zset (sorted set), with the number of occurrences stored as the score. Once that finished, I essentially had the complete vocabulary of every song in the rap playlists.
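The segment-count-store step might look roughly like the sketch below. It assumes jieba's `lcut` function for segmentation and a redis-py client (3.0 or later, where the call is `zincrby(name, amount, value)`) for storage. The function names and the key `rap:words` are hypothetical, and the tokenizer is passed in so the counting logic can be tried without either library installed.

```python
from collections import Counter

def default_tokenizer():
    """Return jieba's list-returning segmenter (requires `pip install jieba`)."""
    import jieba  # imported lazily so count_words works with other tokenizers
    # jieba.enable_parallel(4) could be called here to segment in parallel;
    # jieba documents that mode as unsupported on Windows.
    return jieba.lcut

def count_words(lyrics, tokenize=None):
    """Segment each lyric line and count how often every token occurs."""
    tokenize = tokenize or default_tokenizer()
    counts = Counter()
    for line in lyrics:
        for token in tokenize(line):
            token = token.strip()
            if token:  # skip empty and whitespace-only tokens
                counts[token] += 1
    return counts

def store_counts(client, key, counts):
    """Add the counts to a Redis sorted set, using occurrences as the score.

    Assumption: `client` behaves like a redis-py (>= 3.0) client.
    """
    for word, n in counts.items():
        client.zincrby(key, n, word)
```

Because `ZINCRBY` adds to the existing score, calling `store_counts(client, "rap:words", count_words(song_lines))` once per song accumulates totals across the whole corpus. Punctuation tokens are not filtered in this sketch.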

The next question is how to convert the words in this vocabulary to pinyin. The open-source xpinyin library solves this problem. For example:

xpinyin produced the correct pinyin for the phrase "hot pot base".

So what is a rhyme?

Take Jony J's "Routine" as an example: stride (mai-bu), attitude (tai-du), exposed (wai-lu), lead the way (dai-lu). The rhyme they share is ai-u.

So how do I get the rhyme of a word? After some observation, I found that I can use the vowels "aeiou" as the boundary within each syllable: the rhyme is everything from the first vowel to the end of the syllable (this is actually stricter than the standard definition of rhyme).

For example, hot pot base: huo-guo-di-liao -> uo-uo-i-iao
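The rule above can be sketched in a few lines. The function names are hypothetical; `to_pinyin` assumes xpinyin's `Pinyin().get_pinyin(text, splitter)` call and is only needed when starting from Chinese text rather than from pinyin.

```python
VOWELS = "aeiou"

def syllable_rhyme(syllable):
    """Return the part of a pinyin syllable from its first vowel onward."""
    syllable = syllable.lower()
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            return syllable[i:]
    return syllable  # no a/e/i/o/u found: keep the syllable unchanged

def word_rhyme(pinyin, sep="-"):
    """Map hyphen-separated pinyin to its rhyme, syllable by syllable."""
    return sep.join(syllable_rhyme(s) for s in pinyin.split(sep))

def to_pinyin(word):
    """Convert Chinese text to hyphen-separated pinyin (requires xpinyin)."""
    from xpinyin import Pinyin  # imported lazily; the rhyme rule does not need it
    return Pinyin().get_pinyin(word, "-")

print(word_rhyme("huo-guo-di-liao"))  # uo-uo-i-iao
print(word_rhyme("mai-bu"))           # ai-u
```

All four words from the example (mai-bu, tai-du, wai-lu, dai-lu) map to the same key, ai-u, which is what makes them rhyme under this rule. Syllables spelled with ü are not treated specially by the five-letter rule.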

Word Cloud

Building a word cloud from the words in hip-hop songs gives an intuitive picture of what hip-hop artists mainly sing about:

Come on, freestyle

Enough warm-up. Now I want to do a little freestyle on the topic of "fried rice" (inspired by PG One's fried noodles freestyle), so I first searched for "fried rice":

There are plenty of double-rhyme words, so I used them to write a rap!
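The search tool is not shown, but a lookup of this kind could be sketched as an index keyed by the rhyme of each word's last two syllables. Everything here is illustrative: the function names are hypothetical, and the four sample words with their hand-written pinyin are my own, not taken from the crawled vocabulary.

```python
from collections import defaultdict

def _rhyme(syllable):
    """Part of a pinyin syllable from its first a/e/i/o/u onward."""
    for i, ch in enumerate(syllable):
        if ch in "aeiou":
            return syllable[i:]
    return syllable

def tail_rhyme(pinyin, n=2, sep="-"):
    """Rhyme key of the last n syllables, or None if the word is shorter."""
    syllables = pinyin.lower().split(sep)
    if len(syllables) < n:
        return None
    return sep.join(_rhyme(s) for s in syllables[-n:])

def build_index(word_pinyin, n=2):
    """Group words by the rhyme of their last n syllables (n=2: double rhyme)."""
    index = defaultdict(list)
    for word, pinyin in word_pinyin.items():
        key = tail_rhyme(pinyin, n)
        if key is not None:
            index[key].append(word)
    return index

def find_rhymes(index, word, pinyin, n=2):
    """Return indexed words sharing the query's n-syllable rhyme."""
    key = tail_rhyme(pinyin, n)
    return [w for w in index.get(key, []) if w != word]

# Illustrative sample: fried rice, rebel, set meal, hot pot.
sample = {"炒饭": "chao-fan", "造反": "zao-fan", "套餐": "tao-can", "火锅": "huo-guo"}
index = build_index(sample)
print(find_rhymes(index, "炒饭", "chao-fan"))  # ['造反', '套餐']
```

Passing `n=3` to both `build_index` and `find_rhymes` would give triple rhymes in the same way.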

The video is here. High energy ahead!

https://v.qq.com/x/page/q05574kytoi.html

Fried Rice Freestyle (PG One style: I'm the greatest, you're all idiots)

Beat: Rap God (Instrumental)

Yo yo yo, whatsup.

This is the MC young.

Fried Rice Freestyle

LISTEN.

Drop the beat DJ.

Yo, Yo, yo, yo, listen, listen, listen, check it.

Ready, ready, ready, ready, ready

Don't listen to the rumors,

It's just a test for you.

I'm the director of this show.

AKA Your boss. Ya ya

If you try to be a troublemaker,

Old man of three years,

Send you a luxurious package,

Go home and eat fried rice.

Yeah dude's verse is so productive,

Yeah not like those pirate hater annoying,

It's not good to see, only mischief,

And trying to Gao pan that patron, ya Ya

It's hard not to climb,

I was most dazzling when I found the opposite. SKr SKr

Eh, you asked me which one,

Look at me,

I am a righteous man,

Tongluowan's Chenhaona. Punchline

Double rhyme x19 achieved!

I cut the video and added the subtitles with Final Cut Pro X. It's my first time making a video, so please go easy on me, haha.

To be continued

Here's a promise for next time: I'll write a feature that takes an input word and automatically generates a rhyming rap verse around it that roughly makes sense and reads fluently.

Please follow my WeChat official account "Luyang thought" ~
