Will My Paper Be Accepted at FOCS?
A possible way to help authors write better papers for conferences
Luca Trevisan is one of the greatest theorists in the world. Besides his seminal work on many aspects of theory, he has a terrific blog called in theory, and is the program committee chair of the upcoming FOCS conference. See here for more details on the conference.
Today I want to talk about the process of getting a paper accepted by the FOCS program committee. I have an idea that could make the process better for us all.
I have already discussed an idea recently for deniability for program committees; today’s idea is to help both the committee and the authors. There should be a site called Will my paper be accepted? It would work like this:
After you have finished a draft of your paper, but before you submit it to FOCS, you would go to the site and upload your pdf file. You would then instantly get back a number between 0 and 1.
The number is an estimate of the probability that your paper will be accepted by the program committee. Thus, if you get back a number close to 1, you should feel pretty good—your paper is likely to be accepted. If, instead, you get back a number close to 0, then you are in trouble. Perhaps you should not submit your paper. Or perhaps you need to spend many hours working on the next draft: cleaning up the writing; adding more background, motivation, and references; making the proofs look harder; and in general making the paper “better.”
Note, the site helps both authors and the committee. Authors are aided, since they are more likely to submit solid papers. The program committee is aided, since they will get to read better papers. A win-win situation.
How It Works
Okay, the idea is fun—I hope you agree—but it is also a serious suggestion. I think that the site could be built, not just for FOCS, but for any conference, even for journals, even for the NSF. It is a stretch to expect the site to check the correctness of the proofs in the papers, but I feel it should be at least feasible to build a site that measures the “hotness” or relevance of the paper. Of course the NSF version would be: Will my proposal be funded?
The site would be implemented using machine learning technology. In the case of FOCS we have the list of accepted papers from the last few conferences; the program committee also has—in principle—access to all the rejected papers for those conferences. The problem is then a classification problem: is the given paper closer to the accepted papers or closer to the rejected ones? I am not an expert in machine learning, but it seems plausible to me that this classification could be done reasonably well by a program. Note, the site makes no guarantee on its prediction accuracy—it never returns a definite 0 or 1.
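To make the classification idea concrete, here is a toy sketch in Python of a nearest-centroid scorer over bag-of-words features. Everything here—the feature choice, the function names, the scoring rule—is my own assumption, not a proposal for the real system, which would need far richer features extracted from the actual PDFs:

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase word tokens; a crude stand-in for real feature extraction.
    return re.findall(r"[a-z]+", text.lower())

def centroid(papers):
    # Average word-frequency vector over a collection of papers.
    total = Counter()
    for paper in papers:
        total.update(tokenize(paper))
    n = len(papers)
    return {word: count / n for word, count in total.items()}

def cosine(u, v):
    # Cosine similarity between two sparse word-frequency vectors.
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def acceptance_score(draft, accepted, rejected):
    # Score in [0, 1]: relative closeness to the accepted-papers centroid
    # versus the rejected-papers centroid.
    a = cosine(centroid(accepted), centroid([draft]))
    r = cosine(centroid(rejected), centroid([draft]))
    return a / (a + r) if a + r else 0.5
```

A real implementation would surely use a trained classifier rather than raw centroid distances, but the shape of the problem—two labeled corpora and a similarity judgment—is the same.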
I see three interesting side issues: the first affects co-authored papers, the second is a possible privacy issue, and the last is a research issue.
Suppose Alice and Bob are writing a paper together. Imagine Alice writes her draft and gets a mediocre rating. She then asks Bob to hack away and write the next draft. What happens if his draft gets a much higher rating? What’s up? Should Alice be upset? Or what if Bob raised the rating only slightly, while Alice’s next draft raised it substantially? Should Alice think she is doing more than her share of the work?
There are also potential privacy issues. If the site uses rejected papers, are there any security concerns? Can someone submit lots of papers to the site and get information about a rejected paper? Is this a serious concern, or is it not? Perhaps there is a paper to be written for FOCS on how to make a rating program preserve privacy. Of course this issue could be avoided if the ratings were only based on accepted papers. I think, however, using both accepted and rejected papers should greatly increase the accuracy of the predictions.
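One hypothetical mitigation—purely a sketch, and no substitute for a real privacy analysis—would be to add calibrated noise to the reported score, in the spirit of differential privacy, so that repeated probing reveals less about any single rejected paper. The function name and the parameter choices below are my own assumptions:

```python
import random

def noisy_score(raw_score, epsilon=1.0):
    # Perturb the score with Laplace noise of scale 1/epsilon; a Laplace
    # sample can be drawn as the difference of two exponential samples.
    # Smaller epsilon means more noise and, heuristically, less leakage
    # about any individual training paper. The result is clamped to [0, 1].
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return min(1.0, max(0.0, raw_score + noise))
```

Whether such noise actually suffices here would itself be a research question—perhaps the very FOCS paper suggested above.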
The research issue may be well studied in machine learning theory, but I have not been able to determine whether it is a standard question. The issue is this: consider the set A of accepted papers and the set R of rejected papers. The standard machine learning problem is to try to determine the probability of acceptance based on how close a paper is to A versus R. The twist here is that both A and R have an additional structure: time. Clearly, the interests of the FOCS conference change over time—what was “in” a few years ago may be “out” now. This implies that more recent papers should carry higher weight than older papers. Can current machine learning methods handle this?
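At minimum, the time structure could be handled crudely by down-weighting older training papers. A minimal sketch, where the half-life value is an arbitrary assumption of mine:

```python
def time_weights(years, current_year, half_life=3.0):
    # Exponentially decaying sample weights: a training paper loses half
    # of its influence every `half_life` years. These weights could be
    # handed to any learner that supports per-example weighting.
    return [0.5 ** ((current_year - year) / half_life) for year in years]
```

Whether simple exponential decay captures how conference tastes actually drift—rather than, say, abrupt shifts when a new area becomes hot—is exactly the kind of question the research issue raises.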
Can we build such a site? Should we build such a site? Would you use this site? What do you think?