Xinyi's projects

Xinyi is currently ~~taking an excursion outside academia~~ a full time Python Core Developer in a global investment bank in London.

Research

Topics in Graph Colouring

We study two variants of graph (vertex) colourings: multi-colouring and correspondence colouring. More specifically, the following topics are studied.
1. For given n, k and n', k', if a graph is (n,k)-colourable, then what is the largest subgraph of it that is (n',k')-colourable?
2. For given n, k, if a graph is (n,k)-colourable, then for what n', k' is the whole graph (n',k')-colourable?
3. For given n and n', how many vertices of a n-correspondence-colourable graph can always be properly correspondence-coloured with arbitrary correspondences and arbitrary n'-colour-lists on that graph?
4. How different graph operations affect the correspondence chromatic number of multigraphs?
For definitions and results, see the original thesis.

PhD Thesis

Multi-Colouring of Graphs: towards Stahl’s Conjecture

In ordinary graph colouring, each vertex receives a colour. Such a colouring is proper if adjacent vertices receive different colours. In k-multi-colouring, each vertex receives a set of k colours, and such a multi-colouring is proper if adjacent vertices receive disjoint set of colours. A graph is (n,k)-colourable if there is a proper k-multi-colouring of it using n colours.
For given n, k, if a graph is (n,k)-colourable, then for what n', k' is the whole graph (n',k')-colourable?
We first observe how it can be reformulated into a conjecture by Stahl from 1976 regarding Kneser graphs, and prove new results towards Stahl's conjecture.
Joint work with Jan van den Heuvel. Research talks on this project has been given at BCC 2022, ICGT 2022 and Shandong University Qi Lu Forum.

arXiv

n-Colourable Induced Subgraphs

In 2000, Albertson, Grossman, and Haas asked the following question: given a graph G with list chromatic number n, if each vertex has been assigned a list with n' colours, 1 ≤ n' ≤ n, can we always properly colour at least n'|V(G)|/n vertices of G?
We study the partial colouring problem associated with other variants of graph colouring and answer the following negatively with a series of examples: for any fractional-r-colourable graph G and s < r, does it always have a fractional-s-colourable subgraph of at least r|V(G)|/s vertices? Similar properties are also observed on multiple colouring. Then we study the following.
(a) Given a rational number r > 0 and integer n ≥ 1, for what real number a can we guarantee that every fractional-r-colourable graph G has an n-colourable induced subgraph with at least a|V(G)| vertices?
(b) Given rational numbers r, s > 0, for what real number b can we guarantee that every fractional-r-colourable graph G has a fractional-s-colourable induced subgraph with at least b|V(G)| vertices?
Joint work with Jan van den Heuvel. Research talks on this project has been given at BCC 2021.
For definitions and results, see Chapter 3 of the above thesis.

Partial Multi-Colouring and Fractional Colouring

For given n, k and n', k', if a graph is (n,k)-colourable, then what is the largest subgraph of it that is (n',k')-colourable?
This question is inspired by a partial colouring conjecture asked by Albertson, Grossman, and Haas in 2000 regarding list colouring. We obtain exact answers for specific values of the parameters, and upper and lower bounds on the largest (n',k')-colourable subgraph for general values of n', k'.
Joint work with Jan van den Heuvel. Research talks on this project has been given at BCC 2021.
For definitions and results, see Chapter 4 of the above thesis.

Partial Correspondence Colouring

In correspondence colouring, each vertex is associated with a prespecified list of colours, and there is prespecified correspondence associated with each edge specifying which pair of colours from the two endvertices correspond. A correspondence colouring is proper if each vertex receives a colour from its prespecified list, and that for each edge, the colours on its endvertices do not correspond. A graph is n-correspondence-colourable if a proper correspondence colouring exist for any prespecified correspondences on any prespecified n-colour-lists associated to each vertex.
As correspondence colouring is a generalisation of list colouring, it is natural to ask whether Albertson, Grossman, and Haas' conjecture can be generalised to correspondence colouring. Unfortunately, there are graphs on which their conjectured value does not hold, and we will present a series of them.
We then study: for given n and n', how many vertices of a n-correspondence-colourable graph can always be properly correspondence-coloured with arbitrary correspondences and arbitrary n'-colour-lists on that graph? We generalise some results from the original conjecture in list colouring. Then we discuss some sufficient conditions for a proper correspondence colouring to exist.
For definitions and results, see Chapter 6 of the above thesis.

The Effect of Graph Operations on Correspondence Colouring

As a variant of vertex colouring of graphs, correspondence colouring generalises ordinary and list colouring. Whilst multiple edges do not provide additional restrictions in ordinary or list colouring, additional constraints in correspondence colouring appear when multiple edges are present. We study how certain graph operations affect the correspondence chromatic number of multigraphs and present an interesting result that adding one edge can increase the correspondence chromatic number by arbitrarily large.
Joint work with Jan van den Heuvel. Research talks on this project has been given at LSE, Shandong University and BYMAT 2020.
For detailed definitions and results, see Chapter 5 of the above thesis.

Counting Independent Sets in Graphs

We study an algorithm introduced by Kleitman and Winston. This algorithm encodes independent sets of a given graph in an invertible manner, which allows us to bound number of independent sets in graphs. In particular, we study its applications on counting sum-free sets, counting independent sets in certain types of graphs, counting C₄-free graphs with fixed vertex set and proving Roth’s theorem.

This is a final-year (of the 4-year Msci programme) project at UCL. The project was completed under the supervision of Dr John Talbot.

Teaching

(I meant, proper university teaching.)

Algorithms and Data Structures

Xinyi is teaching Algorithms and Data Structure classes at LSE in academic year 2021/2022.

Introduction to Abstract Mathematics

Xinyi taught Introduction to Abstract Mathematics classes at LSE in academic years 2019/2020 and 2020/2021, and its half-unit course, Mathematical Proof and Analysis in academic year 2021/2022.

Further Quantitative Methods

Xinyi taught Further Quantitative Methods (Matheatics) classes as LSE in academic year 2018/2019.

Modelling & Coding

(Well... Of course you would not expect me to list company project here right ;))

Data Open Global Championship 2021

As the third place winner of Europe Regional Data Open, we were invited to attend this Gloabl Championship in November 2021.

Plastic revolutionised modern life in many ways. Despite the benefits it brings to human life as well as scientific development, increasing production, consumption and disposal of plastic products is becoming one of the major concerns of global environmental health.

In a group of four, we investigated the historical and forecasted end-of-life processing of international plastic trade through a gravity model. In particular, we verified that current trading flows exacerbate the emission of end-to-life plastics, especially in countries that do not possess and/or enforce sufficiently responsible waste processing protocols. Our study observes countries with great management being the exporters of plastics whereas the distribution (of management level) of importing countries is more varied. However, it is evident that there has been a trend of importing countries with lesser waste management capabilities possessing a larger share of the overall trade imports. This dynamic is even more evident when specifically looking at plastic waste trading. We then estimate how changes in the growth for countries will transform the current configuration and made suggestions on improving this situation pragmatically and realistically.

Data Science for All Women's Summit 2021

Xinyi was selected to participant in a rigorous 7-week data science training program focused on building skills in data science, culminating in a capstone presentation.

Banks and bank performance are closely tied to the overall economic health of a country. Bank failures can lead to lower income and compensation growth, higher poverty rates, and lower employment. It is therefore important to understand the factors that may cause a bank to fail. Our project investigates whether it is possible to predict bank failure, and how early we can start seeing signs of failure.

Collaborated with a team of seven, we were able to predict bank failure (on average) five quarters in advance using a Random Forest model. The success and failure of each US commercial bank in each quarter over the last 25 years were classified using publicly available data on US banks mergers and acquisitions and commercial banks' quarterly call reports.

Datafolio Presentation

CodeIT Suisse Singapore 2021

In a team of two, we wrote Python programs and built server using Heroku App to perform certain tasks and in a 24-hour time frame.

See code sample for one problem solely written by Xinyi (more samples written collaboratively as a team, including Git Repo, Heroku App is available upon request).

Code Sample

Terminal - Europe Regional Competition 2021

In a group of three, we developed an AI algorithm to play the Terminal game and defeated more than half of pre-selected teams, where the selection process was already competitive.

Our algorithm was also ranked top 50 in Global competition Season 8, as of July 2021.

Modelling Camp 2021

Asked by the company, we decide (between different strategies) how to minimise the fuel cost if two lorries were sent to deliver vaccines to n cities, while only one of the drivers is certified to distribute vaccines. (I.e. the other driver can only drive and deliver.)

This is a 4-day virtual workshop bringing PhD students from different universities together.

Organiser: ICMS and MAC-MIGS. See the event page here.

Presentation

Europe Regional Data Open 2020 - Third Place Winner

The Europe Regional Data Open 2020 is an one-week online competition in October 2020. Under the topic of gentrification in the New York Metropolitan area, we focused on identifying what made some tracts being able to develop and absorb outside investments without replacing original residents.

With a logistic regression model using Python, we studied the relationship between dominating industry in a tract and its behaviour when outside investment came in. Our study advises a list of tracts that will absorb investment with the least population displacement, which suggests investors on possible ethical and sustainable investments in the city, in the spirit of developing without destroying culture diversity.

We won the third place and our datafolio was referred as the best datafolio amongst all participants.
Teammates: Pei Dong, Joel Perez Ferrer, and Sean Nassimiha.

Organiser: Citadel and Correlation One. Data: Census Bureau’s American Community Survey (2009-2018).

Datafolio Certificate

Projects and Teaching