Origin
"The Rap of China" (literally "China Has Hip-Hop") caught countless eyes this summer and brought hip-hop into the public eye. It is the only variety show I watched this year, and it had a big impact on me. I spent most of this summer in Hangzhou, and on taxi rides during my commute I was almost always binge-watching the show, so I can safely say I have seen it several times over. Along the way I also picked up terms of the trade such as flow, punchline, diss, double rhyme, and triple rhyme.
Of these, rhyming interests me the most. I thought: what if I collected every rhyming word that hip-hop artists rap, so that given just one keyword I could come up with a rhyming freestyle? Wouldn't that be cool ~
Rough idea
The overall idea is straightforward:
songs (lyrics) -> vocabulary -> pinyin -> rhymes -> freestyle anytime, anywhere
Crawling data from the NetEase Cloud Music website
First, open the NetEase Cloud Music playlist page and select the "Rap" category.
The rap playlists curated by users are all listed here. Crawling every playlist listing page gives the full set of rap playlists, 1,275 playlists in total.
Opening one of the playlists shows the page listing its songs.
Crawling the songs on every playlist page gives the song list, more than 10,000 songs in total.
With the song ID of each song, call the lyrics API (http://music.163.com/api/song/lyric?os=pc&id=509135896&lv=-1&kv=-1&tv=-1) to get the lyrics.
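A minimal sketch of this step, using only the Python standard library. The URL pattern is the one quoted above. The `lrc` -> `lyric` layout of the JSON response and the LRC-style time tags are assumptions about what the API returns, and the helper names are hypothetical.

```python
import json
import re
import urllib.request

LYRIC_API = "http://music.163.com/api/song/lyric?os=pc&id={song_id}&lv=-1&kv=-1&tv=-1"

# LRC-style time tags such as [00:12.34]; an assumed format, not confirmed by the post.
TIME_TAG = re.compile(r"\[\d+:\d+(?:[.:]\d+)?\]")


def lyric_url(song_id):
    """Build the lyrics API URL for one song ID."""
    return LYRIC_API.format(song_id=song_id)


def extract_lyric(payload):
    """Pull plain lyric lines out of a decoded JSON response.

    Assumes the text lives under payload["lrc"]["lyric"]; returns [] otherwise.
    """
    raw = (payload.get("lrc") or {}).get("lyric") or ""
    lines = (TIME_TAG.sub("", line).strip() for line in raw.splitlines())
    return [line for line in lines if line]


def fetch_lyric(song_id, timeout=10):
    """Download and parse the lyrics of one song (performs a network request)."""
    request = urllib.request.Request(
        lyric_url(song_id), headers={"User-Agent": "Mozilla/5.0"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_lyric(json.loads(response.read().decode("utf-8")))
```

Keeping the parsing in `extract_lyric` separate from the download makes it easy to re-run the cleanup on saved responses without hitting the site again.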
Of course, NetEase Cloud Music has some anti-crawling measures in place. As soon as I started testing, my real IP was blacklisted and the server returned a flood of 503 responses.
So I found a pool of proxy IPs and rate-limited the crawler, which got around the anti-crawling mechanism.
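One way to combine the two measures, sketched with the standard library only. The class name, the fixed delay, and the retry count are illustrative assumptions; the post does not say how the proxies were rotated or how slow the crawler ran.

```python
import itertools
import time
import urllib.error
import urllib.request


class ThrottledFetcher:
    """Rotate through proxies and wait between requests (illustrative sketch)."""

    def __init__(self, proxies, delay=1.0, sleep=time.sleep):
        self._proxies = itertools.cycle(proxies)  # round-robin over the pool
        self._delay = delay                       # seconds to wait before each request
        self._sleep = sleep                       # injectable, so tests need not wait

    def next_proxy(self):
        """Wait for the rate limit, then return the next proxy in the rotation."""
        self._sleep(self._delay)
        return next(self._proxies)

    def fetch(self, url, retries=3, timeout=10):
        """Fetch a URL, moving on to the next proxy whenever a request fails."""
        last_error = None
        for _ in range(retries):
            proxy = self.next_proxy()
            opener = urllib.request.build_opener(
                urllib.request.ProxyHandler({"http": proxy})
            )
            try:
                with opener.open(url, timeout=timeout) as response:
                    return response.read()
            except (urllib.error.URLError, OSError) as error:  # includes HTTP 503
                last_error = error
        raise RuntimeError(f"all {retries} attempts failed for {url}") from last_error
```

A call would look like `ThrottledFetcher(["http://10.0.0.1:8080"], delay=2.0).fetch(url)`, where the proxy address is a placeholder.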
Crawl process GIF:
After a few hours or so, the lyrics of all 10,000+ songs were in hand!
Data processing (word segmentation, pinyin conversion, finding double and triple rhymes)
Next, I ran word segmentation over the lyrics of every song, using the open-source jieba ("stutter") segmenter in parallel mode.
On a sample sentence, the segmentation result was quite satisfactory.
After segmentation, the occurrences of each word can also be counted.
This way, a whole document can be segmented and counted. So I segmented every song and threw the results into a Redis zset (sorted set), with the number of occurrences recorded as the score. When that finished, I had essentially the full vocabulary of every song in the hip-hop playlists.
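The segment, count, and store steps can be sketched as follows. jieba is imported lazily so the counting and storing helpers work without it. The `min_length` filter and the key name are assumptions added for illustration, and `client` is expected to behave like a redis-py 3.x+ client.

```python
from collections import Counter


def segment(text):
    """Split a lyric line into words with jieba (third-party, imported lazily)."""
    import jieba  # pip install jieba
    # jieba.enable_parallel(4) would switch on the parallel mode (POSIX only).
    return [word.strip() for word in jieba.cut(text) if word.strip()]


def count_words(words, min_length=2):
    """Count occurrences, skipping words shorter than min_length characters."""
    return Counter(word for word in words if len(word) >= min_length)


def store_counts(client, key, counts):
    """Add each word's count to its score in a Redis sorted set.

    Uses the redis-py 3.x+ argument order: zincrby(name, amount, value).
    """
    for word, amount in counts.items():
        client.zincrby(key, amount, word)
```

Because `zincrby` increments the existing score, running `store_counts` once per song accumulates the counts across the whole corpus.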
The next question is how to convert the words in the vocabulary to pinyin. The open-source xpinyin library solves this problem.
For "hot pot base", for instance, xpinyin correctly returns the pinyin huo-guo-di-liao.
So what is a rhyme?
Take Jony J's "Routine" as an example: stride (mai-bu), attitude (tai-du), exposed (wai-lu), and lead the way (dai-lu) all share the rhyme ai-u.
So how do I get the rhyme of a word? After some observation, I found that the letters "aeiou" can serve as the boundary within each syllable: take the first of these letters and everything after it as the rhyme (this is actually stricter than the standard definition of rhyme).
For example, hot pot base (huo-guo-di-liao) has the rhyme uo-uo-i-iao.
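The "aeiou" boundary rule can be sketched as below. The function names are hypothetical, and keeping a vowel-less syllable unchanged is an assumption the post does not cover. xpinyin is imported lazily so that the rhyme helpers run without it.

```python
VOWELS = "aeiou"


def syllable_rhyme(syllable):
    """Return the syllable from its first vowel onward, e.g. 'liao' -> 'iao'."""
    syllable = syllable.lower()
    for index, letter in enumerate(syllable):
        if letter in VOWELS:
            return syllable[index:]
    return syllable  # no vowel found (e.g. 'ng'): keep the syllable unchanged


def word_rhyme(pinyin, splitter="-"):
    """Map hyphen-separated pinyin to its rhyme, e.g. 'mai-bu' -> 'ai-u'."""
    return splitter.join(syllable_rhyme(s) for s in pinyin.split(splitter))


def to_pinyin(word):
    """Convert Chinese text to hyphen-separated pinyin with xpinyin (lazy import)."""
    from xpinyin import Pinyin  # pip install xpinyin
    return Pinyin().get_pinyin(word)
```

Under this rule two words rhyme only when every syllable's tail matches exactly, which is why it is stricter than ordinary rhyming.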
Word Cloud
Building a word cloud from the words in the hip-hop lyrics gives an intuitive picture of what hip-hop artists mainly rap about.
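The post does not say which tool drew the cloud. As one possibility, here is a sketch that assumes the third-party `wordcloud` package and a word-count mapping such as the one kept in Redis; the limit of 200 words, the image size, and the output file name are arbitrary choices.

```python
from collections import Counter


def top_frequencies(counts, limit=200):
    """Keep the `limit` most frequent words as a {word: count} dict."""
    return dict(Counter(counts).most_common(limit))


def render_word_cloud(frequencies, font_path, out_path="rap_cloud.png"):
    """Draw the cloud with the third-party `wordcloud` package (lazy import).

    font_path must point to a font that contains Chinese glyphs, otherwise
    the words render as empty boxes.
    """
    from wordcloud import WordCloud  # pip install wordcloud
    cloud = WordCloud(font_path=font_path, width=800, height=600,
                      background_color="white")
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(out_path)
```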
Come on, freestyle.
That's enough warm-up. Now I want to do a freestyle on the topic of "fried rice" (inspired by PG One's fried noodles freestyle), so I first searched for "fried rice".
The search turned up a lot of double-rhyme words, so I used them to write a rap!
Here is the video. Brace yourselves!
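A search like this boils down to grouping the vocabulary by rhyme and looking up the rhyme of the query. The sketch below uses a tiny hand-written vocabulary rather than the crawled data, and all function names are hypothetical.

```python
from collections import defaultdict

VOWELS = "aeiou"


def rhyme_key(pinyin):
    """'chao-fan' -> 'ao-an': keep each syllable from its first vowel onward."""
    parts = []
    for syllable in pinyin.lower().split("-"):
        start = next((i for i, c in enumerate(syllable) if c in VOWELS), 0)
        parts.append(syllable[start:])
    return "-".join(parts)


def build_index(word_pinyin):
    """Group words by rhyme key; word_pinyin maps word -> hyphenated pinyin."""
    index = defaultdict(list)
    for word, pinyin in word_pinyin.items():
        index[rhyme_key(pinyin)].append(word)
    return index


def find_rhymes(index, pinyin):
    """Return all indexed words whose rhyme matches that of `pinyin`."""
    return index.get(rhyme_key(pinyin), [])


# Illustrative sample only: English glosses mapped to the pinyin of the words.
sample = {"boss": "lao-ban", "package": "tao-can", "stride": "mai-bu"}
matches = find_rhymes(build_index(sample), "chao-fan")  # fried rice -> ao-an
```

Here `matches` holds "boss" and "package", since lao-ban and tao-can share the rhyme ao-an with chao-fan.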
https://v.qq.com/x/page/q05574kytoi.html
Fried Rice Freestyle (PG One style: I'm the greatest and you're all idiots)
Beat: Rap God (Instrumental)
Yo yo yo, what's up.
This is MC Young.
Fried Rice Freestyle
LISTEN.
Drop the beat DJ.
Yo, Yo, yo, yo, listen, listen, listen, check it.
Ready, ready, ready, ready, ready
Don't listen to the rumors,
It's just a test for you.
I'm the director of this show.
AKA Your boss. Ya ya
If you try to be a troublemaker,
Old man of three years,
Send you a luxurious package,
Go home and eat fried rice.
Yeah, this dude's verses are so prolific,
Yeah, not like those annoying pirate haters,
No good to look at, only making mischief,
And trying to climb up to (gao pan) that patron, ya ya
It's hard not to climb,
I was most dazzling when I found the opposite. SKr SKr
Eh, you asked me which one,
Look at me,
I am a righteous man,
Chen Haonan of Tongluowan (Causeway Bay). Punchline
Double rhyme x19 achieved!
The rap video was cut and subtitled with Final Cut Pro X. It is my first time making a video, so please forgive the rough edges, haha.
What's next
Here is a promise for next time: a tool that takes an input word and automatically generates a rhyming rap verse built around it, one that is roughly coherent in meaning.
Please follow my WeChat official account "Luyang thought" ~
I used Python to crawl the hip-hop songs on NetEase Cloud Music and analyze how rappers rhyme