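A quick note on a one-off job: I needed to process a NumPy binary file, and inspection showed that each element inside it is a dict. The task is small and simple, so I am recording it here to remember it. If you are unfamiliar with NumPy or the basics of Python, see the articles I wrote earlier.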
In [1]:
%%time
import numpy as np
Wall time: 135 ms
In [2]:
%%time
import pandas as pd
Wall time: 351 ms
In [3]:
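%%time
# Create a DataFrame from the ndarray.
# allow_pickle=True is needed on recent NumPy versions to load object arrays such as dicts.
df = pd.DataFrame(np.load("Data.npy", allow_pickle=True))
Wall time: 910 ms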
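To try this step without the original file, here is a minimal sketch that writes a tiny stand-in file and loads it back. The file name `sample.npy` and the two records are made up for illustration, and `allow_pickle=True` is an assumption about the environment: recent NumPy releases refuse to load object arrays without it.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-in records, shaped like the dicts in the real file.
records = [
    {"Email": "a@example.com", "pwd": "0" * 32},
    {"Email": "b@example.com", "pwd": "f" * 32},
]

# A 1-D object array of dicts; np.save pickles the elements.
arr = np.array(records, dtype=object)

# Write to a temporary directory so nothing is left in the working folder.
path = os.path.join(tempfile.mkdtemp(), "sample.npy")
np.save(path, arr)

# Loading pickled objects must be enabled explicitly.
loaded = np.load(path, allow_pickle=True)

# A 1-D array becomes a single column named 0, one dict per row.
df = pd.DataFrame(loaded)
print(df.shape)
```

Only enable `allow_pickle` for files you trust, since unpickling can execute arbitrary code.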
In [4]:
%%time
df.head(10)  # quick preview of the first 10 rows
Wall time: 1 ms
Out[4]:
   0
0  {'Email': '[email protected]', 'pwd': '9755dd0556...
1  {'Email': '[email protected]', 'pwd': '6bb518d1a42...
2  {'Email': '[email protected]', 'pwd': '0079abba6...
3  {'Email': '[email protected]', 'pwd': 'e23e561f02...
4  {'Email': '[email protected]', 'pwd': ...
5  {'Email': '[email protected]', 'pwd': '9b084...
6  {'Email': '[email protected]', 'pwd': '7d07...
7  {'Email': '[email protected]', 'pwd': '448a2...
8  {'Email': '[email protected]', 'pwd': 'dbf...
9  {'Email': '[email protected]', 'pwd': '22ddd26d...
In [5]:
%%time
# Extract the email column
df['email'] = df[0].map(lambda x: dict(x)["Email"])
# Extract the pwd column
df['MD5'] = df[0].map(lambda x: dict(x)["pwd"])
# Delete the column that is no longer needed
del df[0]
Wall time: 1.05 s
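As an alternative to the two `map` calls above, the dicts can be expanded into columns in one step. This is only a sketch on made-up records, not the method used in this article; the variable names `raw` and `expanded` are hypothetical.

```python
import pandas as pd

# Hypothetical one-column frame of dicts, mimicking the loaded data.
records = [
    {"Email": "a@example.com", "pwd": "0" * 32},
    {"Email": "b@example.com", "pwd": "f" * 32},
]
raw = pd.DataFrame({0: records})

# Building a DataFrame from a list of dicts turns the dict keys into columns.
expanded = pd.DataFrame(raw[0].tolist())

# Rename to the column names used in this article.
expanded = expanded.rename(columns={"Email": "email", "pwd": "MD5"})
print(expanded)
```

This avoids running a Python lambda once per row per column, although whether it is faster on the real file has not been measured here.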
In [6]:
%%time
df.size  # total number of elements (rows × columns)
Wall time: 0 ns
Out[6]:
2097148
In [7]:
%%time
df.shape  # (rows, columns)
Wall time: 0 ns
Out[7]:
(1048574, 2)
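The two numbers agree with each other: `size` counts every cell, so 1048574 rows × 2 columns = 2097148. A small sketch on a made-up frame (the name `small` and its contents are hypothetical) shows the same relationship:

```python
import pandas as pd

# Hypothetical frame with 3 rows and 2 columns.
small = pd.DataFrame({"email": ["a", "b", "c"], "MD5": ["x", "y", "z"]})

rows, cols = small.shape  # (number of rows, number of columns)
total = small.size        # number of cells, i.e. rows * cols

print(rows, cols, total)
```

To count records rather than cells, `len(df)` or `df.shape[0]` is the value to read.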
In [8]:
%%time
df.head(10)
Wall time: 0 ns
Out[8]:
   email              MD5
0  [email protected]  9755dd05564ead9eadcace40b5a02711
1  [email protected]  6bb518d1a42f22da5ca62d5ee41c5d4f
2  [email protected]  0079abba66856dafdf2b9a6e0db23a09
3  [email protected]  e23e561f0202aceca30b8f07a48ab8e9
4  [email protected]  0eb1a2db91a2bf3fb6275de659a25805
5  [email protected]  9b08473c992c07e98389ed1c280a634a
6  [email protected]  7d0710824ff191f6a0086a7e3891641e
7  [email protected]  448a2bcee09a3b14c22dc000351216b7
8  [email protected]  dbfba02e366bab58df605d6475189a51
9  [email protected]  22ddd26d62af8b1c4a216be18fdff5b2
In [9]:
%%time
# Save as JSON. The transpose makes each row a JSON object keyed by its row index,
# which is the layout we commonly use.
df.T.to_json("User.json")
Wall time: 2.85 s
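To see what the transpose does to the JSON layout, the following sketch compares it with the `orient` parameter of `to_json` on a made-up two-row frame. The data is hypothetical, and the output is kept in memory as a string instead of being written to a file.

```python
import json

import pandas as pd

# Hypothetical two-row frame with the same column names as above.
df = pd.DataFrame({
    "email": ["a@example.com", "b@example.com"],
    "MD5": ["0" * 32, "f" * 32],
})

# With no path argument, to_json returns the JSON text as a string.
transposed = json.loads(df.T.to_json())
by_index = json.loads(df.to_json(orient="index"))
records = json.loads(df.to_json(orient="records"))

# Transposed output and orient="index" both map row index -> {column: value}.
print(transposed == by_index)
# orient="records" drops the index and yields a plain list of objects.
print(records[0])
```

On a frame with about a million rows, `orient="index"` may be worth trying because it skips building the transposed copy, though that has not been timed here.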
One-off task: converting a NumPy ndarray binary file to a JSON file