TI network development kit (NDK)

Source: Internet
Author: User

Efficient design of the NDK

To speed up network development on its high-end DSPs, TI has released a TCP/IP development kit, the NDK (Network Developer's Kit), for its C6000 series chips.

The main components of the NDK are: (1) The TCP/IP protocol stack libraries. These mainly include the library of TCP/IP network tools, the libraries that support the TCP/IP protocol stack on the DSP/BIOS platform, and the network control and thread scheduling library (covering protocol stack initialization and network-related task scheduling). (2) Demonstration programs, mainly DHCP/Telnet client and HTTP/data server demonstrations. (3) Supporting documents, including the user manual, programmer manual, and platform adaptation manual.

The NDK uses a compact design to support TCP/IP with low resource consumption. In practice, the NDK needs only about 200 K of program space and 95 K of data space to support conventional TCP/IP services, including Telnet, DHCP, and HTTP at the application layer. To minimize resource consumption, TI uses several special techniques in the NDK: (1) UDP sockets and raw sockets do not use send or receive buffers; (2) TCP sockets use a send buffer, and whether a receive buffer is used depends on the configuration; (3) data is passed between the low-level driver and the protocol stack by pointer, so packets are not copied repeatedly; (4) a dedicated thread cleans up memory fragments and checks for memory leaks. The NDK is therefore well suited to the hardware environment of current embedded systems and is an important supporting tool for DSP network communication.

The software development environment for the NDK is TI's development tool CCS (Code Composer Studio). It includes the real-time operating system DSP/BIOS and RTDX, the real-time data exchange software between the host and the target board.

2.2 NDK configuration and use

When using the NDK in CCS, the following points need special handling:

(1) Configure DSP/BIOS

Set the main clock through the PRD module. The timer driver in the hardware abstraction layer requires a periodic (PRD) function, with its period given in ms, as the main clock. The function name is llTimerTick().

Set the hooks that reserve storage for the TCP/IP protocol stack. The task scheduling module of the OS library needs hooks to save and load the environment pointer of the TCP/IP protocol stack. The two hook functions are NDK_hookInit() and NDK_hookCreate().

(2) Include files and library files

Note that the include path and the library files must be specified at build time. The default include path is C:/Ti/C6000/ndk/Inc.

(3) Link order when building a CCS project

CCS generally links object files and library files in a specific order, and the NDK is very sensitive to that order. A wrong order can lead to duplicate symbols or even incorrect execution. To avoid this, use the link order setting in the CCS Build Options dialog box to add the files in a fixed order, with the library files at the end. The recommended order is: netctrl.lib, hal_xxx.lib, nettool.lib, stack.lib, and os.lib.

Before starting the protocol stack, assign it working memory (SDRAM). The call is _mmBulkAllocSeg(EXTERN1). You also need to call fdOpenSession() to initialize the file descriptor table; otherwise, socket creation fails.

We implement sending and receiving as a task. Before creating the task, call NC_SystemOpen() to enable and configure the network functions, and perform the corresponding cleanup before the system is shut down.

Note the following when using the socket API functions provided by the NDK: (1) The NDK socket API is connected to the operating system through a file descriptor interface, so a task must call the functions that initialize and close the file descriptor table around its socket operations. (2) The NDK does not provide the full select function of the Windows API, but fdSelect can be used for the same purpose. Likewise, fdClose in the NDK corresponds to the standard close, and fdError to the standard errno. (3) The NDK provides many network tool functions, such as DNS-related functions, which can replace getpeername and gethostname in the standard API. In addition, some IGMP functions support multicast, but only in the role of a multicast client, not a multicast server.

3. Efficiency Test and Performance Analysis of UDP data packets transmitted by ndk

3.1 Test platform structure

We studied the CPU efficiency of sending and receiving UDP data packets with the NDK. The test has two parts. The first measures CPU usage at different transmission rates and L2 cache sizes when the DM642 sends UDP data packets to the PC. The second measures CPU usage at different transmission rates and L2 cache sizes when the DM642 receives data packets sent from the PC. The tools used are the socket API functions provided by the NDK under CCS and the Winsock API under Visual Studio. Figure 4 shows the test environment.

 

Figure 4: ndk test environment

3.2 Configuration and implementation of the test platform

Since the receiving and sending programs are very similar, we use only the sending program as an example. The program that sends data is created as a task. In DSP/BIOS, a task object is a thread managed by the TSK module, which schedules tasks dynamically based on their priorities and current execution state. DSP/BIOS provides 15 task priorities and a set of functions for manipulating task objects, including creating, deleting, and configuring them. Every task object is in one of the following states: running, ready, blocked, or terminated.

In this project, tasks are created in the network control program. Figure 5 shows the flowchart for creating a task:

 

Figure 5: Flowchart for creating the transfer task

The statement for creating the task is TaskCreate(tsk_udp, "udp_video", 5, 0x1000, peer_addr, 12345, 12345). In theory, the data transmission rate can be increased by running two tasks, but the two tasks should then transmit through different ports. The task function is:

static void tsk_udp(IPN ipaddr, int peerport, int localport)
{
    ......
    // Create a socket
    s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    ......
    // Set the local address and port for bind
    bzero(&sin1, sizeof(struct sockaddr_in));
    sin1.sin_family = AF_INET;
    sin1.sin_len    = sizeof(sin1);
    sin1.sin_port   = htons(localport);
    // Bind the local IP address and port
    if (bind(s, (PSA)&sin1, sizeof(sin1)) < 0)
        goto exit_tsk;
    // Set the destination address and port
    bzero(&sin1, sizeof(struct sockaddr_in));
    sin1.sin_family      = AF_INET;
    sin1.sin_len         = sizeof(sin1);
    sin1.sin_addr.s_addr = ipaddr;
    sin1.sin_port        = htons(peerport);
    ......
    // Allocate a working buffer
    if (!(pbuf = mmBulkAlloc(1024)))
        goto exit_tsk;
    // Start sending data
    for (;;)
    {
        // Fill in the buffer with the data to send
        *(int *)pbuf = send_udp_count++;
        // Send 1000 bytes to the destination
        if (sendto(s, pbuf, 1000, 0, (PSA)&sin1, sizeof(sin1)) < 0)
            goto exit_tsk;
        // Clear the data area
        mmZeroInit(pbuf, (uint)test);
        // Set the sending data rate
        TaskSleep(8);   // 1 Mbit/s
    }
    ......
}
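For readers without an NDK target at hand, the same send loop can be sketched with the standard POSIX socket API. This is only an illustration under stated assumptions: the names udp_send_counted and sleep_ms are hypothetical, nanosleep stands in for TaskSleep, a stack buffer stands in for mmBulkAlloc, and the loop sends a fixed number of datagrams instead of running forever.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() under strict -std modes */
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

#define PAYLOAD_BYTES 1000u   /* bytes per datagram, as in the article */

/* Sleep for ms milliseconds (hypothetical stand-in for TaskSleep). */
void sleep_ms(unsigned ms)
{
    struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/*
 * Send count datagrams of PAYLOAD_BYTES to peer_addr:peer_port, one every
 * interval_ms. The first 4 bytes carry a sequence counter in network byte
 * order. peer_addr is in network byte order; ports are in host byte order.
 * Returns the number of datagrams sent, or -1 on error.
 */
int udp_send_counted(uint32_t peer_addr, uint16_t peer_port,
                     uint16_t local_port, uint32_t count, unsigned interval_ms)
{
    unsigned char buf[1024];
    struct sockaddr_in sin;
    int s;

    assert(PAYLOAD_BYTES <= sizeof buf);

    s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (s < 0)
        return -1;

    /* Bind the local port (0 lets the system choose one). */
    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    sin.sin_port = htons(local_port);
    if (bind(s, (struct sockaddr *)&sin, sizeof sin) < 0)
        goto fail;

    /* Destination address and port. */
    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = peer_addr;
    sin.sin_port = htons(peer_port);

    for (uint32_t seq = 0; seq < count; seq++) {
        uint32_t seq_be = htonl(seq);
        memset(buf, 0, sizeof buf);             /* clear the data area   */
        memcpy(buf, &seq_be, sizeof seq_be);    /* fill in the counter   */
        if (sendto(s, buf, PAYLOAD_BYTES, 0,
                   (struct sockaddr *)&sin, sizeof sin) < 0)
            goto fail;
        sleep_ms(interval_ms);                  /* sets the sending rate */
    }
    close(s);
    return (int)count;

fail:
    close(s);
    return -1;
}
```

A call such as udp_send_counted(inet_addr("192.168.0.2"), 12345, 12345, 100, 8) would send 100 datagrams at roughly 1 Mbit/s; the address here is only a placeholder.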

Two key parameters are set in the test: the data sending (receiving) rate and the size of the L2 cache inside the DM642. The rate is changed by changing how long the task sleeps between packets: calling TaskSleep(n) makes the task send one message every n milliseconds. Each message carries 1000 bytes (8000 bits), so TaskSleep(8) gives a transmission rate of 8000 bits / 8 ms = 1 Mbit/s, TaskSleep(4) gives 2 Mbit/s, and so on.
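The relation between the sleep interval and the payload rate can be checked with a few lines of C. This is a minimal sketch; the function names are hypothetical, and the calculation counts only the payload, ignoring UDP, IP, and Ethernet headers, as the text does.

```c
#include <assert.h>
#include <stdio.h>

/* Payload rate in Mbit/s when payload_bytes are sent once every
 * interval_ms milliseconds. Protocol headers are ignored. */
double udp_rate_mbps(unsigned payload_bytes, unsigned interval_ms)
{
    assert(interval_ms > 0);
    /* bits per millisecond equals kbit/s; divide by 1000 for Mbit/s */
    return (payload_bytes * 8.0) / interval_ms / 1000.0;
}

/* Sleep interval in ms needed to reach target_mbps with a given payload. */
double interval_ms_for_rate(unsigned payload_bytes, double target_mbps)
{
    assert(target_mbps > 0.0);
    return (payload_bytes * 8.0) / (target_mbps * 1000.0);
}

/* Print the rate for a few sleep intervals with 1000-byte messages. */
void print_rate_table(void)
{
    const unsigned intervals[] = { 20, 10, 8, 4, 2, 1 };
    for (size_t i = 0; i < sizeof intervals / sizeof intervals[0]; i++)
        printf("sleep %2u ms -> %.1f Mbit/s\n",
               intervals[i], udp_rate_mbps(1000, intervals[i]));
}
```

With 1000-byte messages, an 8 ms interval gives 1 Mbit/s and a 4 ms interval gives 2 Mbit/s, matching the figures above; the lowest rate in the result tables, 0.4 Mbit/s, corresponds to a 20 ms interval.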

The L2 cache size is changed with the following statements:

CACHE_setL2Mode(CACHE_64KCACHE) sets a 64 K L2 cache; CACHE_setL2Mode(CACHE_128KCACHE) sets a 128 K L2 cache, and so on.

3.3 Test results and performance analysis

On the DM642 evaluation board, we use the standard recvfrom function to receive data. The board and a Windows PC exchange data over the connectionless UDP protocol, and CPU usage is compared across transmission rates and L2 cache sizes.

CPU usage = low-priority tasks that can be completed in idle periods/low-priority tasks that can be completed during transmission tasks

Each send or receive transfers 1000 bytes. The results are shown in the following two tables and two figures.

Transmission rate (Mbit/s)    64 K cache    128 K cache    256 K cache
0.4                           0.29          0.21           0.19
0.8                           0.58          0.45           0.38
2                             1.1           1.02           0.96
4                             2.64          2.26           1.88
8                             5.11          4.38           3.64
16                            9.86          8.25           6.89

Table 1: CPU usage (%) when the DM642 sends UDP data packets

 

Figure 6: Comparison of CPU usage when the DM642 sends UDP data packets

Transmission rate (Mbit/s)    64 K cache    128 K cache    256 K cache
0.4                           0.2           0.13           0.14
0.8                           0.35          0.28           0.27
2                             0.82          0.7            0.67
4                             1.62          1.34           1.34
8                             3.65          2.69           2.68

Table 2: CPU usage (%) when the DM642 receives UDP data packets

 

Figure 7: Comparison of CPU usage when the DM642 receives UDP data packets

The comparison shows that CPU usage when sending and receiving data packets increases with the network transmission rate and is roughly linear in it. Because sending and receiving is essentially moving data, the work grows linearly with the amount of data, so for a fixed cache configuration CPU usage grows linearly as well.
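The claimed linearity can be checked against the numbers in Table 1. The sketch below fits a straight line through the origin to the 64 K cache column by least squares; the function names are hypothetical, and the choice of a through-the-origin fit is an assumption made here for illustration, not a method taken from the article.

```c
#include <assert.h>
#include <stdio.h>

#define N_POINTS 6

/* Table 1, 64 K cache column: rate (Mbit/s) and CPU usage (%) when sending. */
const double rate_mbps[N_POINTS] = { 0.4, 0.8, 2.0, 4.0, 8.0, 16.0 };
const double cpu_pct[N_POINTS]   = { 0.29, 0.58, 1.10, 2.64, 5.11, 9.86 };

/* Least-squares slope of y = k * x through the origin:
 * k = sum(x * y) / sum(x * x). */
double slope_through_origin(const double *x, const double *y, int n)
{
    double sxy = 0.0, sxx = 0.0;
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        sxy += x[i] * y[i];
        sxx += x[i] * x[i];
    }
    return sxy / sxx;
}

/* Print measured values next to the fitted line. */
void print_linearity_check(void)
{
    double k = slope_through_origin(rate_mbps, cpu_pct, N_POINTS);
    printf("fitted slope: %.2f %% CPU per Mbit/s\n", k);
    for (int i = 0; i < N_POINTS; i++)
        printf("%5.1f Mbit/s: measured %.2f %%, fitted %.2f %%\n",
               rate_mbps[i], cpu_pct[i], k * rate_mbps[i]);
}
```

For this column the fitted slope is about 0.62 % CPU per Mbit/s. The per-point ratios range from roughly 0.55 to 0.73, so the relationship is approximately, not exactly, linear.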

The L2 cache size also affects CPU usage. In general, the larger the L2 cache, the lower the CPU usage, and the effect becomes more pronounced as the data rate increases. This benefits from the two-level cache architecture and the powerful DMA capability of the DM642.

A side effect of enlarging the L2 cache is that less on-chip memory remains, which reduces the number of code and data sections that fit on chip. This slows the program down, especially for complex codec programs with many data and code sections, so programmers need to choose a reasonable balance for the actual application.

TCP/IP stack-udp

 

Beta-song @ 2008-8-24

Reprinted from: http://www.diybl.com/course/6_system/linux/Linuxjs/2008829/138684.html

 

This article briefly analyzes the process of sending and receiving UDP data packets, without in-depth protocol details.

 

 

UDP protocol entry

 

net/ipv4/af_inet.c, UDP operation set

const struct proto_ops inet_dgram_ops = {
    .family     = PF_INET,
    .owner      = THIS_MODULE,
    .release    = inet_release,
    .bind       = inet_bind,
    .connect    = inet_dgram_connect,
    .socketpair = sock_no_socketpair,
    .accept     = sock_no_accept,
    .getname    = inet_getname,
    .poll       = udp_poll,
    .ioctl      = inet_ioctl,
    .listen     = sock_no_listen,
    .shutdown   = inet_shutdown,
    .setsockopt = sock_common_setsockopt,
    .getsockopt = sock_common_getsockopt,
    .sendmsg    = inet_sendmsg,
    .recvmsg    = sock_common_recvmsg,
    .mmap       = sock_no_mmap,
    .sendpage   = inet_sendpage,
};

 

Each of the ops functions above eventually maps to a function of the UDP protocol. For example, inet_sendmsg is implemented internally by calling sk->sk_prot->sendmsg(iocb, sk, msg, size), where struct proto *skc_prot points to the udp_prot shown below.

 

net/ipv4/udp.c, UDP protocol

struct proto udp_prot = {
    .name        = "UDP",
    .owner       = THIS_MODULE,
    .close       = udp_lib_close,
    .connect     = ip4_datagram_connect,
    .disconnect  = udp_disconnect,
    .ioctl       = udp_ioctl,
    .destroy     = udp_destroy_sock,
    .setsockopt  = udp_setsockopt,
    .getsockopt  = udp_getsockopt,
    .sendmsg     = udp_sendmsg,
    .recvmsg     = udp_recvmsg,
    .sendpage    = udp_sendpage,
    .backlog_rcv = udp_queue_rcv_skb,
    .hash        = udp_lib_hash,
    .unhash      = udp_lib_unhash,
    .get_port    = udp_v4_get_port,
    .obj_size    = sizeof(struct udp_sock),
};

 

The sending process corresponds to the udp_sendmsg function, and the receiving process to the udp_recvmsg function. They are analyzed separately below.
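The two-level dispatch described above (a family-level ops table whose entries forward to a protocol-level table) can be modelled in user space with plain function pointers. The sketch below is a toy model, not kernel code: every toy_ name is hypothetical and the structures are reduced to a single sendmsg member.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sock;

/* Protocol-level operations (plays the role of struct proto). */
struct toy_proto {
    const char *name;
    int (*sendmsg)(struct toy_sock *sk, const void *buf, size_t len);
};

/* Family-level operations (plays the role of struct proto_ops). */
struct toy_proto_ops {
    int (*sendmsg)(struct toy_sock *sk, const void *buf, size_t len);
};

struct toy_sock {
    const struct toy_proto_ops *ops;   /* family-level table   */
    const struct toy_proto     *prot;  /* protocol-level table */
    size_t bytes_sent;                 /* bookkeeping for the demo */
};

/* Protocol-specific implementation, standing in for udp_sendmsg. */
int toy_udp_sendmsg(struct toy_sock *sk, const void *buf, size_t len)
{
    (void)buf;                 /* a real stack would queue the data here */
    sk->bytes_sent += len;
    return (int)len;
}

/* Generic entry point, standing in for inet_sendmsg:
 * it only forwards to the protocol's own sendmsg. */
int toy_inet_sendmsg(struct toy_sock *sk, const void *buf, size_t len)
{
    assert(sk->prot != NULL && sk->prot->sendmsg != NULL);
    return sk->prot->sendmsg(sk, buf, len);
}

const struct toy_proto     toy_udp_prot  = { .name = "UDP", .sendmsg = toy_udp_sendmsg };
const struct toy_proto_ops toy_dgram_ops = { .sendmsg = toy_inet_sendmsg };

/* What the socket layer does for a send call: go through the ops table. */
int toy_sock_sendmsg(struct toy_sock *sk, const void *buf, size_t len)
{
    return sk->ops->sendmsg(sk, buf, len);
}
```

The benefit of this layout is that the socket layer never needs to know which transport protocol is behind a socket; swapping toy_udp_prot for another table changes the behaviour without touching the callers.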

 

 

Sending Process

 

int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, size_t len)
{
    // Step 1: check whether pending frames exist. If so, jump directly to step 4.
    // Step 2: obtain the route. If no route is cached, use ip_route_output_flow to create a route entry.
    rt = (struct rtable *)sk_dst_check(sk, 0);
    ............
    // Step 3: mark the udp_sock as pending, so that no new datagram is started while the current one has not been sent yet
    up->pending = AF_INET;
    ............
    // Step 4: collect and send the data (the data may be spread over an iovec vector, so it has to be gathered first)
    // Make one large IP datagram from the pieces of data.
    // Each piece is held on the socket until ip_push_pending_frames() is called.
    // ip_append_data internally calls __skb_queue_tail(&sk->sk_write_queue, skb) to queue the data on the write queue.
    err = ip_append_data(sk, getfrag, msg->msg_iov, ulen,
                         sizeof(struct udphdr), &ipc, rt,
                         corkreq ? msg->msg_flags | MSG_MORE : msg->msg_flags);
    if (err)
        udp_flush_pending_frames(sk);       // an error occurred: discard all data of the current datagram and clear pending
    else if (!corkreq)
        err = udp_push_pending_frames(sk);  // calls ip_push_pending_frames to send the data
    else if (unlikely(skb_queue_empty(&sk->sk_write_queue)))
        up->pending = 0;                    // clear pending so that new data can be sent
    release_sock(sk);
}

 

 

int ip_push_pending_frames(struct sock *sk)
{
    // Step 1: collect all data from the write queue
    while ((tmp_skb = __skb_dequeue(&sk->sk_write_queue)) != NULL) { ...... }
    // Step 2: send
    err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL,
                  skb->dst->dev, dst_output);  // dst_output passes the data from the transport layer to the network layer
}

 

 

static inline int dst_output(struct sk_buff *skb)
{
    return skb->dst->output(skb);  // the output function of the route entry is ip_output at the network layer,
                                   // which continues by calling ip_finish_output ......
}
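From user space, the corking path in udp_sendmsg (the corkreq branch) can be exercised with the MSG_MORE flag. The following Linux-specific sketch sends two pieces that the kernel combines into one datagram. The function names are hypothetical, and the loopback helper exists only to keep the example self-contained.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send two strings as ONE datagram on a connected UDP socket (Linux only). */
int send_two_pieces_as_one(int s, const char *first, const char *second)
{
    assert(first != NULL && second != NULL);
    /* MSG_MORE: the data is appended to the pending datagram and held back. */
    if (send(s, first, strlen(first), MSG_MORE) < 0)
        return -1;
    /* No MSG_MORE: the pending frames are pushed out as a single datagram. */
    if (send(s, second, strlen(second), 0) < 0)
        return -1;
    return 0;
}

/* Helper: create a UDP socket connected to 127.0.0.1:port (host byte order). */
int udp_connect_loopback(unsigned short port)
{
    struct sockaddr_in peer;
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0)
        return -1;
    memset(&peer, 0, sizeof peer);
    peer.sin_family = AF_INET;
    peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    peer.sin_port = htons(port);
    if (connect(s, (struct sockaddr *)&peer, sizeof peer) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

A receiver bound to the chosen port sees a single 10-byte datagram for send_two_pieces_as_one(s, "hello", "world"), because the first piece waits on the socket's write queue until the second call pushes the pending frames.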

 

 

Receiving Process

 

net/ipv4/udp.c

int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
                size_t len, int noblock, int flags, int *addr_len)
{
    skb = skb_recv_datagram(sk, flags, noblock, &err);  // returns at once if data is available; otherwise blocks
    err = skb_copy_datagram_iovec(skb, sizeof(struct udphdr),
                                  msg->msg_iov, copied);  // copy the data to user space
}

struct sk_buff *skb_recv_datagram(struct sock *sk, unsigned flags, int noblock, int *err)
{
    do {
        skb = skb_dequeue(&sk->sk_receive_queue);  // take a data packet from the sock receive queue
    } while (!wait_for_packet(sk, err, &timeo));   // block if necessary
}

static int wait_for_packet(struct sock *sk, int *err, long *timeo_p)
{
    // add the current process to the wait queue
    DEFINE_WAIT(wait);
    prepare_to_wait_exclusive(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE);
    if (!skb_queue_empty(&sk->sk_receive_queue))
        goto out;                              // data is available, so do not wait
    *timeo_p = schedule_timeout(*timeo_p);     // put this process to sleep
out:
    finish_wait(sk->sk_sleep, &wait);          // remove the process from the wait queue
    return error;
}

About timeo_p

The sleep time is chosen as follows: noblock ? 0 : sk->sk_rcvtimeo;

For a non-blocking call the sleep time is 0, so the call does not block; otherwise it blocks for up to sk->sk_rcvtimeo.

sock_init_data sets this time to MAX_SCHEDULE_TIMEOUT, which is defined as LONG_MAX, that is, it never times out.
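The two cases can be observed from user space on Linux: MSG_DONTWAIT makes a single receive call non-blocking, and the SO_RCVTIMEO socket option replaces the default "never time out" value of the receive timeout. The sketch below uses hypothetical helper names and a loopback socket that never receives anything, so both calls fail with EAGAIN/EWOULDBLOCK.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Open a UDP socket bound to an ephemeral port on the loopback interface. */
int open_bound_udp(void)
{
    struct sockaddr_in a;
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0)
        return -1;
    memset(&a, 0, sizeof a);
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    a.sin_port = 0;
    if (bind(s, (struct sockaddr *)&a, sizeof a) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* Non-blocking receive (sleep time 0): returns -1 with errno set to
 * EAGAIN or EWOULDBLOCK when no datagram is queued. */
ssize_t recv_nonblocking(int s, void *buf, size_t len)
{
    return recv(s, buf, len, MSG_DONTWAIT);
}

/* Blocking receive bounded by timeout_ms. A value of 0 restores the
 * default behaviour of blocking without a timeout. */
ssize_t recv_with_timeout(int s, void *buf, size_t len, long timeout_ms)
{
    struct timeval tv;
    assert(timeout_ms >= 0);
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    if (setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0)
        return -1;
    return recv(s, buf, len, 0);
}
```

Calling recv_nonblocking on an idle socket returns immediately, while recv_with_timeout(s, buf, len, 50) sleeps for about 50 ms before giving up.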

 

When no data is readable, the UDP process blocks and waits. The following shows how the network core receives data and wakes up the UDP process.

 

net/ipv4/ip_input.c

static inline int ip_local_deliver_finish(struct sk_buff *skb)
{
    if ((ipprot = rcu_dereference(inet_protos[hash])) != NULL) {}  // locate the specific transport layer protocol
    ret = ipprot->handler(skb);  // call the handler; the analysis below shows that the handler is udp_rcv
}

 

net/ipv4/af_inet.c

static int __init inet_init(void)
{
    (void)sock_register(&inet_family_ops);
    if (inet_add_protocol(&udp_protocol, IPPROTO_UDP) < 0) {}
}

static struct net_protocol udp_protocol = {
    .handler     = udp_rcv,
    .err_handler = udp_err,
    .no_policy   = 1,
};

 

net/ipv4/udp.c

int udp_rcv(struct sk_buff *skb)
  -> __udp4_lib_rcv(skb, udp_hash, IPPROTO_UDP);
    -> udp_queue_rcv_skb(sk, skb);
      -> sock_queue_rcv_skb(sk, skb);
        -> skb_queue_tail(&sk->sk_receive_queue, skb);  // put the packet on the receive queue of the sock
        -> sk->sk_data_ready(sk, skb_len);              // notify the upper layer that data is ready

 

For the inet protocol family, sk_data_ready is sock_def_readable, as the following shows:

static struct net_proto_family inet_family_ops = {
    .family = PF_INET,
    .create = inet_create,  // calls sock_init_data(sock, sk);
    .owner  = THIS_MODULE,
};

void sock_init_data(struct socket *sock, struct sock *sk)
{
    sk->sk_data_ready = sock_def_readable;
    sk->sk_rcvtimeo   = MAX_SCHEDULE_TIMEOUT;
}

 

static void sock_def_readable(struct sock *sk, int len)
{
    read_lock(&sk->sk_callback_lock);
    if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
        wake_up_interruptible(sk->sk_sleep);
    sk_wake_async(sk, 1, POLL_IN);
    read_unlock(&sk->sk_callback_lock);
}

It can be seen that sock_def_readable wakes up the UDP process waiting on the socket and allows it to continue reading data.
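The receive path traced above (look up the handler by protocol number, queue the packet on the socket, then call the data-ready callback) can be condensed into a single-threaded toy model. Everything named toy_ below is hypothetical: packets are plain integers, the receive queue is a small array, and "waking the reader" is reduced to incrementing a counter.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PROTOS 256
#define TOY_QUEUE_LEN  8

struct toy_sock {
    int    queue[TOY_QUEUE_LEN];              /* stands in for sk_receive_queue */
    size_t count;                             /* packets currently queued       */
    int    wakeups;                           /* times a reader was "woken"     */
    void (*data_ready)(struct toy_sock *sk);  /* stands in for sk_data_ready    */
};

typedef int (*toy_handler)(struct toy_sock *sk, int packet);

/* Handler table indexed by protocol number (plays the role of inet_protos). */
static toy_handler toy_protos[TOY_MAX_PROTOS];

/* Default callback, standing in for sock_def_readable. */
void toy_def_readable(struct toy_sock *sk)
{
    sk->wakeups++;
}

/* Stands in for sock_init_data: install the default callback. */
void toy_sock_init(struct toy_sock *sk)
{
    sk->count = 0;
    sk->wakeups = 0;
    sk->data_ready = toy_def_readable;
}

/* Stands in for inet_add_protocol: fails if the slot is already taken. */
int toy_add_protocol(toy_handler h, unsigned char proto)
{
    if (toy_protos[proto] != NULL)
        return -1;
    toy_protos[proto] = h;
    return 0;
}

/* Stands in for udp_rcv down to sk_data_ready: queue, then notify. */
int toy_udp_rcv(struct toy_sock *sk, int packet)
{
    if (sk->count == TOY_QUEUE_LEN)
        return -1;                    /* queue full: drop the packet */
    sk->queue[sk->count++] = packet;
    sk->data_ready(sk);
    return 0;
}

/* Stands in for ip_local_deliver_finish: dispatch on the protocol number. */
int toy_local_deliver(struct toy_sock *sk, unsigned char proto, int packet)
{
    toy_handler h = toy_protos[proto];
    assert(sk != NULL);
    return h != NULL ? h(sk, packet) : -1;
}
```

After toy_add_protocol(toy_udp_rcv, 17) (17 is the IP protocol number of UDP), a call to toy_local_deliver(&sk, 17, 42) leaves one packet in the queue and one recorded wakeup, while a protocol number without a registered handler is rejected.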
