High and Low-Level C
Jim Larson
1996-09-13
This talk was given at the Section 312 Programming Lunchtime Seminar.
Introduction
Tower of languages. High-level languages can (mostly) be compiled to lower-level ones.
Might want to write in a low-level language for access to the hardware level. Can still write high-level code in a low-level language by compiling it in your head.
Might want to write in a high-level language for flexibility and power. Can still write low-level code in a high-level language by "writing through the compiler".
C Features
Recursive Functions
C has a stack used for the call stack, activation records, and local variables.
Note that C functions cannot be nested, unlike in Pascal. This affords greater freedom in using function pointers, since a function pointer never has to carry an enclosing scope with it.
/*
 * Simple example of a recursive function.
 */
unsigned
fib(unsigned n)
{
    if (n == 0 || n == 1)
        return 1;
    else
        return fib(n - 1) + fib(n - 2);
}
This simple example is a little contrived, and it is a lousy way to compute Fibonacci numbers, since each call recomputes the same subproblems. A better example shows the natural relation between recursive functions and recursive data structures - those that have references to other objects of the same type.
/*
 * Recursive functions work well with recursive data structures.
 */
typedef struct Expr Expr;
struct Expr {
    enum { Number, Plus, Minus, Times, Divide } op;
    union {
        double number;
        struct {
            Expr *left, *right;
        } child;
    } u;
};

double
eval(Expr *e)
{
    switch (e->op) {
    case Number: return e->u.number;
    case Plus:   return eval(e->u.child.left) + eval(e->u.child.right);
    case Minus:  return eval(e->u.child.left) - eval(e->u.child.right);
    case Times:  return eval(e->u.child.left) * eval(e->u.child.right);
    case Divide: return eval(e->u.child.left) / eval(e->u.child.right);
    }
    return 0.0;    /* not reached for a valid op; keeps every path returning a value */
}
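As a sketch of how a client might build and evaluate such a tree: the helpers mknum, mkbinop, and free_expr below are hypothetical names, not part of the talk, and the type and evaluator are repeated only so the fragment compiles on its own. Note that free_expr is itself another recursive function shaped by the recursive data structure.

```c
#include <stdlib.h>

typedef struct Expr Expr;
struct Expr {
    enum { Number, Plus, Minus, Times, Divide } op;
    union {
        double number;
        struct {
            Expr *left, *right;
        } child;
    } u;
};

/* Same evaluator as in the listing above. */
double
eval(Expr *e)
{
    switch (e->op) {
    case Number: return e->u.number;
    case Plus:   return eval(e->u.child.left) + eval(e->u.child.right);
    case Minus:  return eval(e->u.child.left) - eval(e->u.child.right);
    case Times:  return eval(e->u.child.left) * eval(e->u.child.right);
    case Divide: return eval(e->u.child.left) / eval(e->u.child.right);
    }
    return 0.0;
}

/* Hypothetical helper: release a whole tree; accepts a null pointer. */
void
free_expr(Expr *e)
{
    if (!e)
        return;
    if (e->op != Number) {
        free_expr(e->u.child.left);
        free_expr(e->u.child.right);
    }
    free(e);
}

/* Hypothetical helper: allocate a leaf node, or return 0 on failure. */
Expr *
mknum(double value)
{
    Expr *e = malloc(sizeof *e);

    if (e) {
        e->op = Number;
        e->u.number = value;
    }
    return e;
}

/*
 * Hypothetical helper: allocate an interior node that takes ownership
 * of both children.  On any failure the children are released and 0
 * is returned.
 */
Expr *
mkbinop(int op, Expr *left, Expr *right)
{
    Expr *e = malloc(sizeof *e);

    if (!e || !left || !right) {
        free_expr(left);
        free_expr(right);
        free(e);
        return 0;
    }
    e->op = op;
    e->u.child.left = left;
    e->u.child.right = right;
    return e;
}
```

A caller could then write mkbinop(Times, mkbinop(Plus, mknum(1.0), mknum(2.0)), mknum(3.0)) to represent (1 + 2) * 3 and pass the result to eval().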
Dynamic memory allocation
Stack allocation - local variables.
Heap allocation. Library only, but pervasively used. (Actually, this is our first example of a high-level feature implemented entirely in Standard C.)
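A small sketch contrasting the two lifetimes, assuming nothing beyond the standard library. The function name copy_string is hypothetical. The local variables live on the stack and vanish on return, while the malloc'd block survives until the caller frees it.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a heap-allocated copy of s, or 0 if malloc() fails.
 * The caller owns the result and must free() it.
 */
char *
copy_string(const char *s)
{
    size_t len = strlen(s) + 1;    /* local variable: stack-allocated */
    char *copy = malloc(len);      /* block it points to: heap-allocated */

    if (!copy)
        return 0;
    memcpy(copy, s, len);
    return copy;                   /* the heap block outlives this call */
}
```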
Abstract Data Types
C type theory is kind of confusing: there are types where you know the size of the object, types where you don't know the size of the object, and void *.
Tricks: You can declare and use a pointer to an incomplete type. You can complete an incomplete type later. Pointers to structures can be coerced to and from pointers to their first element, if it also is a structure.
/* In widget.h */
typedef struct Widget Widget;
extern Widget *widget_new(double length, enum Color color);
extern double widget_mass(Widget *w);
extern int widget_fitsin(Widget *w_inner, Widget *w_outer);
extern void widget_delete(Widget *w);
The implementation gets to hide information about the representation, as well as guarantee invariants.
/* In widget.c */
#include <stdlib.h>
#include "colors.h"
#include "widget.h"

/*
 * Non-public definition of Widget structure, declared in "widget.h".
 */
struct Widget {
    Widget *next;        /* widgets are stored on a linked list */
    int id;              /* identification stamp */
    double length;       /* length in centimeters */
    double mass;         /* mass in grams */
    enum Color color;    /* see "colors.h" for definitions */
};

static const double widget_height = 2.54;     /* in centimeters */
static const double widget_density = 1.435;   /* in g/cm^3 */
static Widget *widget_list = 0;
static int widget_next_id = 0;

/*
 * Create a new widget. Calculate widget mass. Keep track of
 * bookkeeping with id number and store it on linked list of widgets.
 */
Widget *
widget_new(double length, enum Color color)
{
    Widget *w = malloc(sizeof (Widget));

    if (!w)
        return 0;
    w->next = widget_list;
    widget_list = w;
    w->id = widget_next_id++;
    w->length = length;
    w->mass = 0.5 * length * length * widget_height * widget_density;
    w->color = color;
    return w;
}
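The header also declares widget_mass, widget_fitsin, and widget_delete, which the listing does not show. One way they might look is sketched below as a single self-contained file. The enum Color values stand in for the unseen "colors.h", and the rule used by widget_fitsin (the inner widget must be strictly shorter) is purely an assumption for illustration. The point is that widget_delete can maintain the linked-list invariant without clients ever seeing the list.

```c
#include <stdlib.h>

enum Color { Red, Green, Blue };    /* assumption: stands in for "colors.h" */

typedef struct Widget Widget;
struct Widget {
    Widget *next;
    int id;
    double length;
    double mass;
    enum Color color;
};

static const double widget_height = 2.54;     /* in centimeters */
static const double widget_density = 1.435;   /* in g/cm^3 */
static Widget *widget_list = 0;
static int widget_next_id = 0;

Widget *
widget_new(double length, enum Color color)
{
    Widget *w = malloc(sizeof (Widget));

    if (!w)
        return 0;
    w->next = widget_list;
    widget_list = w;
    w->id = widget_next_id++;
    w->length = length;
    w->mass = 0.5 * length * length * widget_height * widget_density;
    w->color = color;
    return w;
}

/* Accessor: clients cannot read w->mass themselves. */
double
widget_mass(Widget *w)
{
    return w->mass;
}

/* Assumed rule: the inner widget fits if it is strictly shorter. */
int
widget_fitsin(Widget *w_inner, Widget *w_outer)
{
    return w_inner->length < w_outer->length;
}

/* Unlink w from the private list, then release it. */
void
widget_delete(Widget *w)
{
    Widget **pp;

    for (pp = &widget_list; *pp; pp = &(*pp)->next) {
        if (*pp == w) {
            *pp = w->next;
            break;
        }
    }
    free(w);
}
```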
Nonlocal exits
Setjmp/longjmp work like a bunch of immediate returns from functions. Intermediate functions don't need to make provisions for this - a modular way to raise error conditions.
Viewing function call/return sequences (aka procedure activations) as a tree, longjmp can only work on a saved jmp_buf from an ancestor in the tree, that is, a function that has not yet returned.
#include <signal.h>
#include <setjmp.h>
#include <stdio.h>

static jmp_buf begin;

static void
fpecatch(int sig)
{
    warning("floating point exception");
    longjmp(begin, 0);    /* a value of 0 makes setjmp() return 1 */
}

void
command_loop(void)
{
    for (;;) {
        if (setjmp(begin)) {
            printf("Command failed to execute!\n");
        }
        signal(SIGFPE, &fpecatch);
        prompt();
        do_command(read_command());
    }
}
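The listing above relies on helpers that are not shown, so here is a self-contained sketch of the same idea without signals. The names sum_checked and safe_sum are hypothetical. A deeply nested call detects an error and abandons every pending activation at once, and none of the intermediate calls has to test for failure.

```c
#include <setjmp.h>

static jmp_buf on_error;    /* saved context of the outermost caller */

/*
 * Recursively sum an array.  A negative entry is treated as an error
 * and abandons all pending recursive calls in one jump.
 */
static int
sum_checked(const int *v, int n)
{
    if (n == 0)
        return 0;
    if (v[0] < 0)
        longjmp(on_error, 1);    /* setjmp() in safe_sum() now returns 1 */
    return v[0] + sum_checked(v + 1, n - 1);
}

/* Return the sum of v[0..n-1], or -1 if any entry is negative. */
int
safe_sum(const int *v, int n)
{
    if (setjmp(on_error))
        return -1;               /* reached only by way of longjmp() */
    return sum_checked(v, n);
}
```

The jump is legal because safe_sum is still active (an ancestor in the call tree) when longjmp runs.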
High-Level C
Classes and objects
The core of OO is "dynamic dispatch" - you don't know at the call site which function you're calling. Function pointers also provide this kind of indirection.
Structure coercion and nesting provide for single inheritance of classes.
Method calls can be masked by functions or macros. Can "crack open" the abstraction to cache methods.
/* In shape.h */
typedef struct Point Point;
struct Point {
    double x, y;
};

typedef struct Shape Shape;
struct Shape {
    void (*move)(Shape *self, Point p);
    void (*scale)(Shape *self, double factor);
    void (*rotate)(Shape *self, double degrees);
    void (*redraw)(Shape *self);
};

extern Shape *make_triangle(Point center, double size);
In the implementation:
/* In triangle.c */
#include <stdlib.h>
#include "shape.h"

typedef struct Triangle Triangle;
struct Triangle {
    Shape ops;          /* must be the first member: Shape * and Triangle * interconvert */
    Point center;
    Point voffset[3];
};

static void
tri_move(Shape *self, Point p)
{
    Triangle *t = (Triangle *) self;

    t->center = p;
}

static void
tri_scale(Shape *self, double factor)
{
    Triangle *t = (Triangle *) self;
    int i;

    for (i = 0; i < 3; ++i) {
        t->voffset[i].x *= factor;
        t->voffset[i].y *= factor;
    }
}

static void
tri_redraw(Shape *self)
{
    Triangle *t = (Triangle *) self;
    Point c = t->center;
    Point v0 = addpoint(c, t->voffset[0]);
    Point v1 = addpoint(c, t->voffset[1]);
    Point v2 = addpoint(c, t->voffset[2]);

    drawline(v0, v1);
    drawline(v1, v2);
    drawline(v2, v0);
}

/*
 * Initializers must follow the member order of struct Shape:
 * move, scale, rotate, redraw.  (tri_rotate is not shown.)
 */
Shape triangle_ops = { &tri_move, &tri_scale, &tri_rotate, &tri_redraw };

Shape *
make_triangle(Point center, double size)
{
    Triangle *t = malloc(sizeof (Triangle));

    if (!t)
        return 0;
    t->ops = triangle_ops;
    t->center = center;
    t->voffset[0].x = size * V0X;
    t->voffset[0].y = size * V0Y;
    t->voffset[1].x = size * V1X;
    t->voffset[1].y = size * V1Y;
    t->voffset[2].x = size * V2X;
    t->voffset[2].y = size * V2Y;
    return &t->ops;
}
In a client module that uses the interface:
/* In animate.c */
#include <unistd.h>    /* sleep() (POSIX) */
#include "shape.h"

void
pulsate(Shape *s, double period)
{
    double factor = 1.0, step = 0.1;
    int i;
    void (*scale)(Shape *, double) = s->scale;    /* cache method */
    void (*redraw)(Shape *) = s->redraw;

    for (;;) {
        for (i = 0; i < 10; ++i) {
            factor += step;
            (*scale)(s, factor);
            (*redraw)(s);
            sleep(1);
        }
        step *= -1.0;
    }
}
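The remark that method calls can be masked by functions or macros could be sketched as follows. This is a cut-down, self-contained illustration: the area method, the Square class, and the names SHAPE_SCALE, shape_area, and square_init are hypothetical and do not appear in the Shape interface above.

```c
typedef struct Shape Shape;
struct Shape {
    void (*scale)(Shape *self, double factor);
    double (*area)(Shape *self);
};

/*
 * Macro wrapper: hides the repeated receiver in a method call.
 * Caution: the argument s is evaluated twice.
 */
#define SHAPE_SCALE(s, f)   ((s)->scale((s), (f)))

/* Function wrapper: evaluates its argument only once. */
double
shape_area(Shape *s)
{
    return s->area(s);
}

typedef struct Square Square;
struct Square {
    Shape ops;      /* first member, so Shape * and Square * interconvert */
    double side;
};

static void
sq_scale(Shape *self, double factor)
{
    Square *q = (Square *) self;

    q->side *= factor;
}

static double
sq_area(Shape *self)
{
    Square *q = (Square *) self;

    return q->side * q->side;
}

/* Fill in the method table and the instance data. */
void
square_init(Square *q, double side)
{
    q->ops.scale = sq_scale;
    q->ops.area = sq_area;
    q->side = side;
}
```

Client code then reads SHAPE_SCALE(s, 3.0) or shape_area(s) and never mentions the function pointers directly.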
Closures
In Scheme, abstractions (lambda expressions) carry their environment with them. This is like bundling a function pointer and some data to work with. The data acts like "configuration" data. In C you can either make the data an explicit argument, or create an ADT for the closure and make an "apply_closure" function, at the cost of breaking from ordinary function-call syntax.
Much like objects, implemented above, but less constrained in use.
/* In closure.h */
typedef struct Closure Closure;
struct Closure {
    void *(*fn)(void *);
    void *data;
};

/* static inline, so that each file including this header gets its own copy */
static inline void *
appclosure(Closure *t)
{
    return (*t->fn)(t->data);
}
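A sketch of the Closure type in use, with the "configuration" data being a counter's current value and step. The names Counter, counter_next, make_counter, and free_counter are hypothetical, and the Closure definition is repeated so the fragment stands alone.

```c
#include <stdlib.h>

typedef struct Closure Closure;
struct Closure {
    void *(*fn)(void *);
    void *data;
};

void *
appclosure(Closure *t)
{
    return (*t->fn)(t->data);
}

/* Environment captured by a counter closure. */
typedef struct Counter Counter;
struct Counter {
    int value;
    int step;
};

/* Closure body: advance the counter and return a pointer to its value. */
static void *
counter_next(void *data)
{
    Counter *c = data;

    c->value += c->step;
    return &c->value;
}

/* Bundle counter_next with a fresh environment; returns 0 on failure. */
Closure *
make_counter(int start, int step)
{
    Closure *t = malloc(sizeof *t);
    Counter *c = malloc(sizeof *c);

    if (!t || !c) {
        free(t);
        free(c);
        return 0;
    }
    c->value = start;
    c->step = step;
    t->fn = counter_next;
    t->data = c;
    return t;
}

/* Release a closure made by make_counter(); accepts a null pointer. */
void
free_counter(Closure *t)
{
    if (t) {
        free(t->data);
        free(t);
    }
}
```

Two closures made from the same function keep independent state, which is what distinguishes a closure from a bare function pointer.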
Exception handling
Want to be able to raise a certain kind of error to be handled by a designated handler, but with the "linkage" between the two established dynamically.
#include <stdio.h>
#include "exception.h"

void
logcommands(char *filename)
{
    FILE *f;

    if (!(f = fopen(filename, "w")))
        THROW_ERROR(Err_filename);
    CATCH_ERROR(ALL_ERRS) {
        fflush(f);
        fclose(f);
        THROW_ERROR(throwerror);    /* re-raise to the enclosing handler */
    } ERRORS_IN {
        while ((x = read_input(stdin)) != EOF) {
            a = compute_result(x);
            print_result(a, f);
        }
    } END_ERROR;
}
The implementation is kind of tricky - it uses nested ternary expressions to turn a complicated series of tests into a single expression that can sit inside an if condition:
/* In exception.h */
#include <setjmp.h>

/* an enum constant, since a const int cannot size a file-scope array in C */
enum { maxcatchers = 100 };
extern jmp_buf catchers[maxcatchers];
extern volatile int nextcatch;
extern volatile int throwerror;

#define ALL_ERRS 0

#define CATCH_ERROR(E) \
    if ((nextcatch == maxcatchers) \
        ? error("too many exception handlers") \
        : (setjmp(catchers[nextcatch++]) == 0) \
        ? 0 \
        : ((E) != ALL_ERRS && throwerror != (E)) \
        ? (longjmp(catchers[--nextcatch], 1), 0) \
        : 1)

#define ERRORS_IN else

#define END_ERROR do { --nextcatch; } while (0)

#define THROW_ERROR(E) \
    do { \
        throwerror = (E); \
        longjmp(catchers[--nextcatch], 1); \
    } while (0)
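The handler-stack idea can also be tried without macros. The sketch below is a function-based variant under stated assumptions, not the talk's implementation: the names throw_error, checked_div, try_div, and ERR_DIVZERO are hypothetical. It shows the push/pop discipline explicitly. A handler is pushed before the protected code runs, popped by the normal exit path, and popped by the thrower on the error path.

```c
#include <setjmp.h>
#include <stdlib.h>

enum { MAXCATCHERS = 16 };
enum { ERR_DIVZERO = 1 };               /* hypothetical error code */

static jmp_buf catchers[MAXCATCHERS];   /* stack of active handlers */
static volatile int nextcatch = 0;      /* number of active handlers */
static volatile int thrown = 0;         /* code of the error in flight */

/* Pop the innermost handler and jump to it; abort if there is none. */
static void
throw_error(int e)
{
    if (nextcatch == 0)
        abort();
    thrown = e;
    --nextcatch;
    longjmp(catchers[nextcatch], 1);
}

/* Knows nothing about who will handle the error. */
static int
checked_div(int a, int b)
{
    if (b == 0)
        throw_error(ERR_DIVZERO);
    return a / b;
}

/* Return a / b, or fallback if the division raised ERR_DIVZERO. */
int
try_div(int a, int b, int fallback)
{
    int slot;

    if (nextcatch == MAXCATCHERS)
        return fallback;                /* no room for another handler */
    slot = nextcatch;
    nextcatch = slot + 1;               /* push our handler */
    if (setjmp(catchers[slot]) == 0) {
        int q = checked_div(a, b);

        --nextcatch;                    /* normal exit: pop our handler */
        return q;
    }
    /* Error path: throw_error() has already popped our handler. */
    if (thrown != ERR_DIVZERO)
        throw_error(thrown);            /* not ours: pass it outward */
    return fallback;
}
```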
Continuations
Scheme's general continuations will let you resume at any previous location in the call tree - cool for escapes, coroutines, backtracking, etc.
Setjmp/longjmp will let you jump up to a known location. Disciplined use can result in catch/throw.
By making stacks explicit, you can do coroutines.
By using a continuation-passing style, you can do arbitrarily cool things using Cheney on the MTA. See paper by Henry Baker.
typedef void *(*Genfun)(void);

void
trampoline(Genfun f)
{
    while (f)
        f = (Genfun) (*f)();
}
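A sketch of a trampoline in use. It is a variant of the loop above, not the same code: ISO C does not guarantee conversions between void * and function pointers, so this version wraps the function pointer in a struct. The names Step, count_down, and sum_to are hypothetical. Each step returns the next step instead of calling it, so the stack never grows however many steps run.

```c
typedef struct Step Step;
struct Step {
    Step (*fn)(void);    /* next function to run, or 0 to stop */
};

static long counter;     /* state carried between steps */
static long total;

/* One step: add counter to total, then ask to be run again. */
static Step
count_down(void)
{
    Step next;

    if (counter == 0) {
        next.fn = 0;             /* finished: stop the trampoline */
        return next;
    }
    total += counter--;
    next.fn = count_down;        /* "tail call" handed to the trampoline */
    return next;
}

/* Sum 1..n in constant stack space by bouncing on the trampoline. */
long
sum_to(long n)
{
    Step s;

    counter = n;
    total = 0;
    s.fn = count_down;
    while (s.fn)
        s = s.fn();
    return total;
}
```

A directly recursive count_down would need one stack frame per step; here the depth stays at two frames.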
Garbage-collected memory
Through explicit maintenance of roots into the GC'd heap, and of the types of objects, you can safely ignore free() and the associated bookkeeping.
"Conservative GC" implementations allow GC behavior for standard programs. See work by Hans Boehm, or the commercial product by Geodesic Systems.
Cheney on the MTA gives simple generational capabilities.
Low-level C
Bounded memory usage
Ted Drain's idea: have a malloc() substitute profile allocation during a sample run; then you can build a dedicated allocator for a heap whose size is known at compile time.
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static size_t allocbytes = 0;

/* Report the running total when the program exits. */
void
printalloc(void)
{
    fprintf(stderr, "Allocated %lu bytes\n", (unsigned long) allocbytes);
}

void
ememinit(void)
{
    int err;

    err = atexit(&printalloc);
    assert(err == 0);
}

/* malloc() substitute that records how many bytes were requested. */
void *
emalloc(size_t size)
{
    void *p = malloc(size);

    assert(p);
    allocbytes += size;
    return p;
}
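The second half of the idea, the dedicated allocator, might look like the sketch below. It is a simple bump allocator over a static array, under stated assumptions: HEAPSIZE is a made-up figure standing in for whatever the profiling run reports, balloc and Align are hypothetical names, and there is no way to free individual blocks.

```c
#include <stddef.h>

#define HEAPSIZE 4096    /* assumption: stands in for the profiled total */

/* Blocks are rounded up to this size to keep them suitably aligned. */
typedef union {
    long l;
    double d;
    void *p;
} Align;

/* The whole heap is a static object whose size is fixed at compile time. */
static union {
    unsigned char bytes[HEAPSIZE];
    Align align;         /* forces alignment of the array */
} heap;
static size_t heapused = 0;

/* Allocate size bytes from the static heap, or return 0 if it is full. */
void *
balloc(size_t size)
{
    size_t rounded;
    void *p;

    if (size > HEAPSIZE)
        return 0;        /* also keeps the rounding below from overflowing */
    rounded = (size + sizeof (Align) - 1) / sizeof (Align) * sizeof (Align);
    if (rounded > HEAPSIZE - heapused)
        return 0;
    p = heap.bytes + heapused;
    heapused += rounded;
    return p;
}
```

Because the heap is an ordinary static array, the program's total memory footprint is known when it is linked.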
Bounded call stack depth
Trampoline back and manage the stack explicitly.
Can also implement a state machine within a function.
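A minimal sketch of a state machine kept inside one function. The example, a word counter named count_words, is hypothetical. The current state is an ordinary local variable and transitions are assignments, so no calls are made per transition and stack depth is fixed.

```c
#include <ctype.h>

/* Count whitespace-separated words in s using a two-state machine. */
int
count_words(const char *s)
{
    enum { Outside, Inside } state = Outside;
    int words = 0;

    for (; *s; ++s) {
        switch (state) {
        case Outside:
            if (!isspace((unsigned char) *s)) {
                state = Inside;      /* first character of a new word */
                ++words;
            }
            break;
        case Inside:
            if (isspace((unsigned char) *s))
                state = Outside;     /* the word has ended */
            break;
        }
    }
    return words;
}
```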
Data type representation
Structure ordering and offsets - make explicit with arrays.
Byte-level representation - use arrays of characters. Portable binary representations of output.
typedef char ipv4hdr[20];

struct {
    char *fieldname;
    int byteoffset;
    int bitoffset;
    int bitlength;
} ipv4fields[] = {
    { "vers",                    0, 0,  4 },
    { "hlen",                    0, 4,  4 },
    { "service type",            1, 0,  8 },
    { "total length",            2, 0, 16 },
    { "identification",          4, 0, 16 },
    { "flags",                   6, 0,  3 },
    { "fragment offset",         6, 3, 13 },
    { "time to live",            8, 0,  8 },
    { "protocol",                9, 0,  8 },
    { "header checksum",        10, 0, 16 },
    { "source ip address",      12, 0, 32 },
    { "destination ip address", 16, 0, 32 }
};
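A table like this can drive a single generic accessor. The sketch below uses the hypothetical name getfield and rests on two assumptions: bit offset 0 means the most significant bit of a byte (network order), and the header bytes are treated as unsigned char rather than plain char to avoid sign extension. Because it reads one bit at a time, the result does not depend on the host's byte order or structure padding.

```c
/*
 * Extract a field of bitlength bits (at most 32), starting bitoffset
 * bits into byte byteoffset of hdr.  Bits are numbered from the most
 * significant bit of each byte.
 */
unsigned long
getfield(const unsigned char *hdr, int byteoffset, int bitoffset,
         int bitlength)
{
    unsigned long value = 0;
    int bit = byteoffset * 8 + bitoffset;    /* absolute bit index */
    int i;

    for (i = 0; i < bitlength; ++i, ++bit) {
        unsigned long b = (hdr[bit / 8] >> (7 - bit % 8)) & 1u;

        value = (value << 1) | b;
    }
    return value;
}
```

For a header whose first byte is 0x45, getfield(hdr, 0, 0, 4) yields the version 4 and getfield(hdr, 0, 4, 4) yields the header length 5.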