Ten Tips for Writing CS Papers, part 2
This continues the first part on tips to write computer science papers.
6. Ideal Structure of a Paragraph
A paper has different levels of formal structure: sections, subsections, paragraphs, sentences. It is important to ensure that the structure of the content aligns well with the formal structure because the formal structure is readily perceived by the reader, whereas the structure of the content is not. With a good alignment we make it easy for the reader to build the right mental model for the organization of the content; this enables better navigation and memory of the content.
An important consequence of a well-organized paper is that it minimizes the possible surprise for the reader. You want to surprise readers with how amazing your method or achievements are, but not through the organization of the paper.
How to align the content with the formal structure? There is more to say about this and I recommend the reading listed at the end of this article, but here I want to focus on the structure of one or multiple paragraphs. The basic rules are:
- One paragraph should contain only a single idea or a single point of argumentation.
- The beginning and the end of a paragraph glue the paragraph into the surrounding content.
There is an ambiguity as to what constitutes a separate idea and indeed paragraphs could be of quite different lengths.
To achieve a good structure, here is a recipe that works for me. For a section I would like to write, I make a list of bullet points of things I want to say, with one bullet point being a single idea or important point. I then note the dependencies between the points and use the dependencies to order the list. Finally, I write one paragraph for each item on the list, and I may add an additional paragraph at the beginning and end of the section to connect the section to the surrounding content.
I found that this recipe also makes my job as a writer easier because it overcomes my writing inhibition in two ways. First, I can start by simply making a list and this does not feel like writing. Second, once the ordering of ideas is clear, the actual writing becomes a lot simpler.
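The ordering step of this recipe amounts to a topological sort of the points by their dependencies. As a minimal sketch, assuming Python 3.9 or later (for the standard-library graphlib module) and a made-up outline whose point names are purely illustrative, it could look like this:

```python
from graphlib import TopologicalSorter


def order_points(dependencies):
    """Return the points so that each one comes after the points it depends on.

    `dependencies` maps a point to the set of points the reader must have
    seen before it. Raises graphlib.CycleError if the points depend on
    each other in a circle.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical outline for a method section (names are illustrative only).
outline = {
    "notation": set(),
    "problem statement": {"notation"},
    "our algorithm": {"problem statement"},
    "comparison to prior work": {"notation", "our algorithm"},
}

# One paragraph per entry, written in this order.
paragraph_order = order_points(outline)
```

A cycle error from the sorter is itself useful information: it suggests that two points cannot be explained independently and may belong in the same paragraph.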
Here is an example of a less-than-ideal paragraph from Section 2.3 in (Gehler and Nowozin, 2008).
As already mentioned to our knowledge (Argyriou et al., 2006) were the first to note the possibility of an infinite set of base kernels and they also stated the subproblem (Problem 1). We will defer the discussion of the subproblem to the next section and shortly comment on the differences of the algorithm of (Argyriou et al., 2006) and the IKL algorithm. We denote with g the objective value of a standard SVM classifier with loss function l.
Let us reverse engineer the content of this paragraph, then restructure it. The paragraph makes two points: first, it draws a connection to the work of (Argyriou et al., 2006). Second, it establishes some notation. So it should perhaps be split into two paragraphs.
For the first point, the beginning is also less than ideal: "As already mentioned to our knowledge"; it is a bit redundant and apologetic to point out that we already mentioned it and that we do not know better. The second point, the notation, is okay by itself, but it is unclear why it follows the first: is it done in order to enable the comparison between approaches? We would need to read ahead to find out. (This is indeed the case.) Here is a proposed improvement:
(Argyriou et al., 2006) first recognized the possibility of an infinite set of base kernels and we now discuss the connection to our work.
To make the connection explicit we first establish the notation we will use throughout the paper. We use g to denote the objective value of a standard SVM classifier, where l is the loss function.
It is simpler to read and makes it clear why we introduce the notation. Also note the end and beginning of the paragraphs: the end of the first paragraph tells what comes next ("the connection to our work"), the beginning of the second paragraph tells how this will be done (through notation). The flow between the paragraphs is natural now and they could almost be merged into one again, with the one idea of the resulting paragraph being "the connection between (Argyriou et al.) and our work".
7. Avoid Ambiguous Relative Pronouns (this, these, that, which)
When used properly, a relative pronoun, such as "this", "these", "that", "which", can effectively refer to a previously mentioned noun, and that has to be remembered by the reader.
In the previous sentence, which entity does "that" refer to? Is it "a previously mentioned noun"? Or is it "a relative pronoun"? Or is it the proper use?
Ambiguities of relative pronouns are common because the writer does not experience the ambiguity. After all, it is clear to the writer what he refers to. Train yourself to recognize any potentially ambiguous relative pronoun, ideally by using a highlighter to mark them in a printout.
To resolve the ambiguity, the easiest solution is simply to add the noun the pronoun refers to. For the above example, "that" would become "that noun".
(Another issue I ran into frequently was deciding between "which" and "that": I used "which" in cases where "that" should have been used, such as in "We use an algorithm which is efficient." I remember annoying a former American colleague of mine by using "which" a bit too often. Some advice is available.)
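The highlighter pass over a printout can also be approximated on the source text. The following is only a rough sketch, not a grammar checker: it assumes the four pronouns named in the heading are the ones worth flagging, uses an arbitrary [[...]] marker, and marks every occurrence, leaving it to the writer to decide which ones are actually ambiguous.

```python
import re

# The pronouns named in the heading of this tip.
PRONOUNS = ("this", "these", "that", "which")

# Match the pronouns as whole words only, ignoring case.
_PATTERN = re.compile(r"\b(" + "|".join(PRONOUNS) + r")\b", re.IGNORECASE)


def mark_pronouns(text):
    """Wrap every candidate pronoun in [[...]] so it stands out on review."""
    return _PATTERN.sub(lambda m: "[[" + m.group(0) + "]]", text)


def count_pronouns(text):
    """Count the candidate pronouns, e.g. to find pronoun-heavy paragraphs."""
    return len(_PATTERN.findall(text))
```

For example, mark_pronouns("We use an algorithm which is efficient.") returns "We use an algorithm [[which]] is efficient.". Because the match is on whole words, "thesis" is left alone.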
Here is a real example from an ICDM paper of mine. I highlight all relative pronouns.
Extracting such geometric patterns from molecular 3D structures is one of the central topics in computational biology, and numerous approaches have been proposed. Most of them are optimization methods, which detect one pattern at a time by minimizing a loss function (e.g., [14, 15, 6]). They are different from our approach enumerating all patterns satisfying a certain geometric criterion. In particular, they do not have a minimum support constraint. Instead they try to find a motif that matches all graphs.
This is not the worst example but it can be improved nevertheless. The first "which" is best removed, and the other relative pronouns are best clarified. Here is a proposed improvement:
Extracting such geometric patterns from molecular 3D structures is one of the central topics in computational biology, and numerous approaches have been proposed. Most of them are optimization methods, detecting one pattern at a time by minimizing a loss function (e.g., [14, 15, 6]). These optimization methods are different from our approach enumerating all patterns satisfying a certain geometric criterion. In particular, other methods do not have a minimum support constraint and instead try to find a motif that matches all graphs.
8. Provide Continuation Markers
Continuation markers are sentences or paragraphs, typically at the beginning of sections, that tell the reader what will be presented next and how it is relevant or how it relates to what has been presented already. They provide structure and flow, connecting the different parts of the paper.
Here is an example, from an ICCV paper:
"3. Method
We now describe the model for tracking fast moving objects. While the motion model is standard, the observation model for raw ToF captures is a novel contribution."
Note two elements here: first, there is an explicit statement of what will be presented next (the model for tracking fast moving objects). Second, we establish relevance with respect to the contribution.
There are two reasons why thinking about natural continuation markers is important. First, they enable navigation through the paper by allowing the reader to skip sections more efficiently. Second, without the necessary background it may take a reader multiple readings to fully understand the paper. If you have lost the reader, providing a natural re-entry point makes it easier to continue reading the paper despite a lack of understanding of some parts.
Both reasons are especially important for reviewers, a special type of reader. Ideally the reviewer is an expert in the field already, so we want to make it easy for him to quickly navigate to the relevant parts of the paper. Less ideally, the reviewer is working under time pressure or without keen interest in the work; in this case we would like to minimize misunderstandings and missed important points during reading.
It is important to co-locate the continuation markers with the actual text itself. It is not sufficient to provide a mini table-of-contents as part of the introduction ("In Section 2 we present related work. In Section 3 we present our method. etc.").
9. Multiple Authors
It is a reality that most computer science papers are authored by multiple authors. Coordinating the writing between multiple authors can be challenging both in terms of content and in terms of technology.
In terms of content, in my experience a recipe for disaster is to divide the paper into parts and agree that "author A will write the introduction, author B will write the method, etcetera". The resulting draft will be incoherent and everyone has an excuse for delaying their part due to perceived dependencies ("I'll write the method once the notation is defined in the introduction", "I'll write the introduction when we have results").
Also, when dividing up work this way the draft can be poorly balanced in terms of relevant parts, as authors tend to be assigned to the parts they have contributed to the most, which provides an incentive to describe their own contribution in too much detail (for example, senior authors writing the introduction would fill it with a discussion of their past work that led to the current work; the author writing about the implementation would want to go into detail because it was really difficult to get it to work and people may miss just how difficult it was, etcetera).
It is better to assign responsibility to a single author to write a full draft and then iterate together over this draft. There are two reasons why this is better: first, clear responsibility gets stuff done; second, the draft will be more coherent with a more linear flow of arguments.
The single author draft works best if the draft writer is an experienced author because iterating on a poorly organized draft takes more effort than a complete rewrite. When iterating on a draft it is important to distinguish substantial from minor changes. Minor changes are changes that fix issues locally, such as adding a sentence for clarification, changes of word order, typos, etc. These changes are important but not urgent. Most accomplished authors I know prefer to make these changes in passes through the full paper, much like polishing the paper with each reading.
Substantial changes are things like addition or removal of sections, changing the order of the presentation, enlarging or shrinking the claimed contribution, etcetera. Such changes can have large implications on the other parts of the paper which need to be addressed, and therefore such changes are important and urgent because they require less time if made early.
In terms of technology, I frequently experienced problems due to the diversity of authors and their working styles. Often some authors will be senior authors with a proven but dated work setup, for example, not using basic version control systems and being stuck in an inflexible editor that mangles LaTeX every time it opens a file. To be fair, these authors are often the most essential in terms of providing feedback on the content of the paper and they may have little time available to stay up-to-date with the latest tools. For addressing this problem with technology, my recommendations are the following:
- Use a version control system: this should almost go without saying, and even if you are the sole author of a paper it is best to use a version control system because it provides a simple method to back up your work. But for multiple authors, coordinating the writing of a paper without a version control system is simply a waste of time and nerves of everyone involved.
- Use a friendly version control system that provides a simple web interface; BitBucket is my favorite for paper writing because it offers free private git repositories and allows you to view changes in a neat timeline in the browser. While hardly surprising to any git user, this feature is readily appreciated by everyone. Also, for minor changes BitBucket actually allows editing from within the browser.
- For yourself: when writing LaTeX, write one sentence on a line and use a line break after each sentence. This makes merging conflicts easier and leads to fewer surprises with strange editors breaking long lines. (I also found that this helps me to improve the organization of a paragraph because every sentence now starts at the beginning of a line.)
- When you only need feedback from your coauthors, sending them a PDF for annotation via email could still be the most efficient.
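The one-sentence-per-line convention from the list above can be applied to an existing paragraph mechanically. Here is a minimal sketch under a deliberately naive assumption: a sentence ends at ".", "!" or "?" followed by whitespace and then a capital letter or a LaTeX command. Abbreviations followed by a capitalized word (such as "et al. We") will be split incorrectly, so the output should be reviewed by hand. The function expects a single paragraph without blank lines.

```python
import re

# Naive sentence boundary: end punctuation, whitespace, then an uppercase
# letter or a backslash (the start of a LaTeX command).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z\\])")


def one_sentence_per_line(paragraph):
    """Reflow one paragraph of LaTeX source so each sentence is on its own line."""
    # Collapse existing line breaks and repeated spaces first.
    flat = " ".join(paragraph.split())
    return "\n".join(_SENTENCE_END.split(flat))
```

With one sentence per line, a version control diff shows exactly which sentences changed, which is what makes conflicts between coauthors easier to merge.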
10. Authorship and Author Ordering
Apart from the writing itself, another common problem with multiple authors is discussions about authorship and author ordering. While not related to writing papers per se, I do want to share some remarks on this topic. There are only a few common situations where debates about author ordering arise. Here are a few common examples, with the more common cases first:
- A small contributor or someone involved in early discussions wants to be a co-author, but other authors disagree based on the amount of time contributed.
- There is a PhD student, a post-doc, and a faculty author, and in most computer science venues the recognition is strongest for the first and last author positions. The post-doc feels he guided the student the most and so deserves to be recognized, but the faculty member may feel differently based on seniority or being the source of funding.
- Two students contributed to a piece of work and each sees their own contribution as the strongest; this happens sometimes when a student postpones a line of work and another student continues the work, directed by a joint supervisor.
- Two or more senior authors each feel that they started or guided the project the most.
Obviously there is no "right" way to handle these circumstances, and indeed computer science handles authorship differently to, say, mathematics. Of course everyone agrees that scientific authorship should imply substantial contributions to the work, and that is about as ambiguous a statement as can be made. To be more concrete, here are some observations.
First, some conflicts can be anticipated, for example the case of the two students. Here, it is best to discuss a possible publication and authorship as soon as the second student gets involved. This discussion should be summarized via email for future reference. Likewise for the case of the small contributor: as soon as it is clear the work will end up in a publication, a discussion should help to set expectations, for example to offer authorship only if additional work is invested.
Second, as a young PhD student one naturally underestimates the implicit future benefits that arise from co-authorship. For example, the senior co-authors could present the work at venues otherwise inaccessible, or the work could lead to substantial collaborations with the original co-authors.
Third, when considering whether to include a small contributor as co-author, the problem is most often not the co-authorship itself, but possible future actions by the contributor after the paper is published (for example, giving seminar talks about the paper). The other authors may feel that credit and opportunities are taken away from them. By discussing early not just the co-authorship itself but also which future paper-related actions are done by whom, these problems can be avoided. For example, all authors could agree that seminar and job talks about the work should be presented by the lead author.
Recommended Reading
I have bought many books on writing, especially when I started my PhD. But there is one that stands above all others, and if you are writing papers I can recommend it to you, no matter whether you are just starting out or have been writing for decades.
This book is "Scientific Writing: A Reader and Writer's Guide" by Jean-Luc Lebrun.
Acknowledgements. Thanks to Jonathan Strahl for corrections to the article.