Conference Reports: International Conference on Machine Learning (ICML) 2023

Sonja Kraiczy (University of Oxford, https://sonja.uk)

I attended ICML 2023, held from July 23rd to July 29th, 2023, at the Hawaii Convention Center on the Hawaiian island of Oahu. My main reason for attending was to present a paper accepted at the main conference. The paper, titled "Properties of the Mallows Model Depending on the Number of Alternatives: A Warning for an Experimentalist," is a collaborative effort with Piotr Faliszewski and Niclas Boehmer. This collaboration began during a research visit that Niclas and I made to Professor Piotr Faliszewski in Krakow in August 2022 and is still ongoing.

On the research I presented

The part of our work that I presented focused on analyzing a widely used random model for ranked data. We explored this model both theoretically and in terms of its applicability to real-world scenarios where the number of ranked items can vary. The model, characterized by a fixed dispersion parameter, is often used in the literature to generate realistic data for evaluating algorithmic properties, among other things. We studied the behaviour of the fixed-parameter approach as the number of items increases and explored the idea of rescaling the parameter based on the number of items. The two models yield different theoretical predictions for various properties of the data, which allowed us to examine real-world datasets with varying item counts and compare their statistics to the predictions of each model. Our findings led us to caution against relying solely on the fixed-parameter assumption and to recommend the scaled model as the default for simulations.
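For readers unfamiliar with this family of models, below is a minimal sketch (in Python) of how rankings can be sampled from the Mallows model using the standard repeated insertion method, contrasting a fixed dispersion parameter with a simple rescaling by the number of items. The values and the rescale_phi helper are hypothetical illustrations of the two approaches discussed above, not the normalization we propose in the paper.

```python
import random

def sample_mallows(m, phi, rng=random):
    """Sample a ranking of items 0..m-1 from a Mallows model with
    dispersion parameter phi via the repeated insertion method.
    phi = 1 gives a uniformly random ranking; phi close to 0
    concentrates mass on the reference ranking 0, 1, ..., m-1."""
    ranking = []
    for i in range(m):
        # Insert item i at position j (0 = top) with probability
        # proportional to phi ** (i - j).
        weights = [phi ** (i - j) for j in range(i + 1)]
        r = rng.random() * sum(weights)
        j, acc = 0, weights[0]
        while acc < r:
            j += 1
            acc += weights[j]
        ranking.insert(j, i)
    return ranking

# Fixed-parameter approach: the same dispersion regardless of m.
FIXED_PHI = 0.8  # hypothetical value, for illustration only

# Hypothetical rescaling that ties the dispersion to the number of
# items m (a placeholder, not the normalization from the paper).
def rescale_phi(phi, m):
    return phi ** (1.0 / m)

for m in (5, 20):
    print(m, sample_mallows(m, FIXED_PHI))
    print(m, sample_mallows(m, rescale_phi(FIXED_PHI, m)))
```

Running this for increasing m makes it easy to see how the two approaches diverge as the number of items grows.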


Perks of a large conference with a broad audience

I presented our work in the form of a poster during the afternoon session on the third day of the main conference. What made this experience particularly valuable was the diverse audience at ICML. Our research falls into one of the many specialized areas represented at the conference, which resulted in a wide range of interdisciplinary interactions. Not only did machine learning researchers visit my poster, but I also got to talk to a neuroscience student and people from the tech and finance industries, which helped give a broader perspective on the research. People with different backgrounds are not accustomed to the common approaches in our area and may offer a fresh and unbiased perspective. This was in stark contrast to our presentation of the same paper at the non-archival workshop COMSOC 2023, where the audience was more narrowly focused. In my opinion, nurturing both types of interactions is essential for academic growth.

Talks and research that inspire

ICML was the first conference I ever attended, and I had the privilege of attending for the second time this year. What struck me most was the palpable sense of excitement (and worry) that permeates the conference. It always feels like a beacon for research that is leading us straight into the future. The impact of ICML was particularly evident through its invited speakers in fields such as genetics, climate science, and social impact. These speakers provided valuable insights and inspired us to align our research with their respective domains.

I was particularly excited to attend a talk by Nobel Prize laureate Jennifer Doudna. She received the Nobel Prize in Chemistry in 2020, jointly with Emmanuelle Charpentier, for her work on the development of the CRISPR-Cas9 gene-editing technology, a groundbreaking tool that has revolutionized genetic research and biotechnology. Jennifer Doudna discussed how machine learning research could work together with genetics to advance our understanding of the field and potentially cure genetic diseases, following the successful example of sickle cell disease. Similarly, Shakir Mohamed from Google DeepMind addressed the challenges in climate science and how machine learning can enhance our understanding and prediction of climate phenomena. These experiences have inspired me to explore the role of machine learning and large language models in algorithmic game theory and algorithmic social choice in future research. Obtaining elaborate data on people's preferences can be infeasible due to cognitive constraints, and unfortunately the quality of a decision suffers if it relies on lower-quality data. Large language models could potentially serve as intermediaries that extract and predict preferences from interactions, possibly overcoming a major limitation of our research area.

Opportunities and social activities

The conference’s setting in Hawaii added a unique dimension to the experience. Attendees used the ICML app to organize activities such as snorkeling and hiking in the mornings before the conference. After the conference, various meet-ups brought together individuals with shared interests, whether for activities, research discussions, or networking within specific institutions or companies. Additionally, some attendees had the opportunity to attend complimentary dinners hosted by prominent companies, providing a chance to engage in discussions with industry experts and fellow attendees. I had great fun at both the Google Research and Jane Street dinners. My visit ended with trying surfing for the first time with PhD students from all over the globe. I thank AISB for their financial support, which helped make my visit to ICML possible.

Profile picture of Sonja giving a thumbs up in front of a scenic view.

About the Author

Sonja Kraiczy is a PhD student at the University of Oxford, supervised by Professor Edith Elkind. Her research lies in the areas of algorithmic social choice and algorithmic game theory. Previously, she read for an MSc in Mathematics and Foundations of Computer Science at Oxford, and she completed her undergraduate degree in Mathematics and Computer Science at the University of Glasgow.

Conference Reports: Empirical Methods in Natural Language Processing (EMNLP) 2022

Chen Tang (University of Surrey, chen.tang@surrey.ac.uk)

The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP) took place from 7th to 11th December this year, hosted at the Abu Dhabi National Exhibition Centre in the United Arab Emirates (UAE). As one of the top-tier conferences in Natural Language Processing (NLP) and Artificial Intelligence (AI), EMNLP had a record attendance of nearly 3,000 attendees this year, of whom 1,629 attended in person and 1,316 virtually.

The first two days consisted of 24 workshops and 6 tutorials across a range of NLP fields, and the three-day main conference started on 9th December. As in past years, EMNLP had a very competitive acceptance rate while presenting cutting-edge research advances and pressing issues in NLP. Owing to the popularity of deep learning techniques in recent years, EMNLP received a large volume of submissions (4,190) this year, while the overall acceptance rate of the main conference dropped from 23% five years ago to 20% now. In addition to the 829 papers accepted as long or short papers, EMNLP also accepted a further 549 (13%) papers in Findings.

Alongside the conference during the daytime, there were two social events, held on the evenings of the 8th and 10th, respectively. The first was the Welcome Reception, held on Thursday December 8, 2022 from 7:00pm to 10:00pm at the Aloft Abu Dhabi Hotel, where the community could meet, make new friends, and catch up with colleagues over a drink and light canapés. The second was held on Saturday December 10, 2022 from 7:00pm to 10:00pm at the Ritz-Carlton Abu Dhabi, Grand Canal Hotel. With its captivating sunsets and extraordinary view of the Sheikh Zayed Grand Mosque, attendees were able to enjoy an evening out with luxurious food and drinks, along with Abu Dhabi's local entertainment, compliments of the Department of Tourism, and a DJ for some evening dancing.

Keynote Lectures

Four speakers were invited to give keynotes. Professor Neil Cohn, from the Department of Communication and Cognition at Tilburg University, gave the first talk, on "The Multimodal Language Faculty and the Visual Languages of Comics". Contrary to notions of language as an amodal system, natural human communication is multimodal and combines speech, gestures, writing, and pictures. He showed us how consistent "universal" linguistic principles persist across the structural diversity of different visual languages. He argued that a multimodal language faculty requires us to change our conception of linguistic relativity, and he also showed how subtle structures of spoken languages permeate across to visual languages.

The second speaker, who impressed me the most, was Nazneen Rajani from the Hugging Face (HF) team. HF is one of the most popular platforms for sharing AI resources and models. Nazneen is a Research Lead at Hugging Face, a startup with a mission to democratize ML, where she leads data-centric ML research involving systematically analyzing, curating, and automatically annotating data. Before HF, she worked at Salesforce Research with Richard Socher and led a team of researchers focused on building robust natural language generation systems based on LLMs. She gave us a macro-level view of how the NLP model landscape has evolved, based on a systematic study of 75K HF models. In the second part, she discussed advances, challenges, and opportunities in evaluating and documenting NLP models developed in an industry setting. Based on the results, she led us to see a paradigm shift from model-centric to data-centric evaluation and documentation.

Gary Marcus, an Emeritus Professor of Psychology and Neural Science at NYU, gave the next talk. He is a successful scientist and also a serial entrepreneur (founder of Robust.AI and of Geometric Intelligence, acquired by Uber). He spoke on "Towards a Foundation for AGI", shedding light on whether large pretrained language models like GPT-3 and PaLM are adequate as a basis for artificial general intelligence (AGI) and, if not, what a better foundation for general intelligence would look like.

Finally, Mona Diab, the Lead Responsible AI Research Scientist at Meta, gave the last talk, "Towards a Responsible NLP: Walking the Walk". Mona Diab is a very experienced researcher. In addition to her achievements in industry, she is also a full Professor of Computer Science at the George Washington University (on leave), where she directs the CARE4Lang NLP Lab. Her talk posed many deep questions to us, e.g., do we need to re-orient and re-pivot NLP? If so, what is needed to make this happen? Can we chart together a program where we ensure that science is the pivotal ingredient in CL/NLP? Could Responsible NLP be an avenue that could lead us back towards that goal? In this talk, she explored some "practical" ideas around framing a responsible NLP vision, hoping to achieve a higher scientific standard for our field, addressing issues from "how" we conduct our research and venturing into "what" we work on and produce, using tenets from a responsible-mindset perspective.

Tutorials

Tutorials provide great opportunities for NLP researchers to get to know a new area or approach in NLP. The tutorials at EMNLP 2022 included the following topics:

  • Meaning representations for natural language
  • Arabic language processing
  • Language-based coordination in deep multi-agent systems
  • Causal inference for NLP
  • Modular and parameter-efficient fine-tuning for NLP models
  • Non-autoregressive model for fast sequence generation

All tutorials were recorded and are expected to be released in the future. Details of the tutorials and their speakers can be found on the conference website.

Workshops and Volunteering

It was our honor to be invited to present our work in the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM) and the 3rd Workshop on Figurative Language Processing (FLP), held on 7th and 8th December respectively. In the GEM workshop, we encountered a variety of outstanding state-of-the-art work on Natural Language Generation (NLG), which is one of the most active research fields within NLP. We got the chance to meet many leading groups from top universities such as Cambridge, Oxford, Stanford, Carnegie Mellon, and so on. The authors from these groups presented their work to us in detail with proficient but comprehensible explanations, and we gained a lot of inspiration from discussing current advances in their research directions. The FLP workshop was also quite inspiring and covered a number of promising directions, including the processing of metaphor, euphemisms, and other types of figurative language.

On the second day (8th), we took responsibility as volunteers for the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP). We had a keynote (Knowledge-Based News Event Analysis and Forecasting) from Dr. Oktie Hassanzadeh, a Senior Research Staff Member at IBM T.J. Watson Research Center, and an overview of recent FinNLP studies from the FinNLP organizers. We also got the chance to make friends with the host of FinNLP, Chung-Chi Chen, a researcher at the Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, Japan. He has done plenty of work in finance-oriented NLP and information retrieval, and he also invited me to visit him in Tokyo, Japan, to collaborate on finance and NLG.

Main Conference Papers

In the main conference, there were 175 oral presentations and 654 posters in total over the 3 days. Presentations were mainly divided into the following tracks:

  1. Natural Language Generation
  2. Resources and Evaluation
  3. Semantics
  4. Summarization
  5. Industry
  6. NLP Applications
  7. Language Modeling and Analysis of Language Models
  8. Sentiment, Stylistic Analysis, Argument Mining & Discourse
  9. Speech, Vision, Robotics, Multimodal Grounding
  10. Question Answering
  11. Ethics & Computational Social Science and Cultural Analytics
  12. Dialog and Interactive Systems
  13. Multilinguality
  14. Efficient Methods for NLP
  15. Information Retrieval and Text Mining
  16. Interpretability, Interactivity, and Analysis of Models for NLP
  17. Machine Learning for NLP
  18. Machine Translation
  19. Commonsense Reasoning
  20. Unsupervised and Weakly Supervised Methods
  21. Morphology, Syntax, Linguistics, Psycholinguistics

Because of the large volume of presentations, it was hard to attend all of them. My research area is natural language generation, so we mainly joined the tracks related to it, including tracks 1, 4, 6, 10, 12, and 17.

Best Papers

The best paper awards of EMNLP this year went to "Abstract Visual Reasoning with Tangram Shapes" in the long paper category and "Topic-Regularized Authorship Representation Learning" in the short paper category.

We really appreciate that AISB awarded us a Travel Award to help with the cost of attendance. We are very grateful for the chance to attend this top-tier conference in person, where we communicated with many excellent academic peers, and we believe this precious experience will contribute to our future research. The network built at this event allows us to become better involved with cutting-edge AI research from around the world.