Michael Shekelyan's Personal Webpage

Academic Profiles

	google scholar [link]
	orcid [link]
	dblp [link]

Academic Journey compsci.science

Teaching & Research Positions

Queen Mary University of London
2023-: Lecturer (Assist. Prof.)

Research Positions

King's College London
2021-2023: Research Associate
University of Warwick
2018-2021: Research Fellow

Education

Libera Università di Bolzano
2018: PhD in Computer Science
University of Munich
2014: Diploma in Media Informatics

Academic Services

Meta-Reviewer / Conference Officer:
- NeurIPS Area Chair (2024)
- ICDT Proceedings Chair (2024)
Reviewer (Machine Learning):
- ICML (2022, 2023, 2024)
- NeurIPS (2022, 2023)
- ICLR (2024, 2025)
- AAAI (2025)
- AISTATS (2021, 2023)
Reviewer (Data Mining):
- ACM SIGKDD (2022)
- ACM WSDM (2022)
- IEEE ICDM (2021)
- IEEE DSAA (2023)
- Data Min. Knowl. Discov. (2022)
- Inf. Sci. Journal (2020)
Reviewer (Computer Security):
- IEEE T-IFS Journal (2021)
- IEEE TDSC Journal (2021, 2022)
Reviewer (Distributed Systems):
- ACM DEBS (2021)
Reviewer (Data Management):
- ACM SIGMOD (2020, 2021)
- IEEE ICDE (2016, 2021, 2022)
- EDBT (2020)
- SSDBM (2018)
- VLDB Journal (2018, 2019, 2022, 2023, 2024)
- TKDE Journal (2018, 2021, 2022)

Publications as Lead Author

Random Sampling

EDBT'23 [link, preprint, pdf]
AISTATS'21 [link, pdf]

Data Summarisation

IEEE ICDE'21 [link, pdf]
ACM PODS'21 [link, pdf]
PVLDB'17 [link, slides, pdf]

Sparse Prefix Sums

Information Systems'19 [link]
ADBIS'17 [award, slides, link]

Multiobjective Shortest Path

IEEE ICDE'15 [link, pdf]
SSTD'15 [link, pdf]
DASFAA'14 [link, pdf]

Publications as Co-Author

String Processing

SPIRE'23 [link, pdf]

Join Sampling

ACM SIGMOD'21 [link]

Image Similarity Search

SPIE'12 [link, pdf]

Bio

I was born in Moscow, grew up in Hamburg and later moved to Munich, where I studied and worked with Prof. Matthias Schubert and Prof. Hans-Peter Kriegel's database group (University of Munich). I did my PhD in Italy under the supervision of Prof. Johann Gamper (Libera Università di Bolzano). I then went to the UK for postdoctoral research under Prof. Graham Cormode (University of Warwick) and Dr. Grigorios Loukides (King's College London), followed by an appointment as Lecturer in Computer Science (Queen Mary University of London).

Research

My research focuses primarily on algorithms, data structures and summaries for managing very large or sensitive data. The overall goal is to build a full data pipeline that feeds end users with easily interpretable facts which provide novel insights and aid decision-making processes. Reducing the data complexity through either sampling or summarisation plays a crucial role in supporting exploratory interactions with the data that involve a lot of probing, while still providing an intuitive approximation model of the data. Sensitive data calls for privacy-preserving techniques such as differential privacy and federated learning, which facilitate data sharing between organisations whilst minimising risks to the privacy of the patients, users, customers and employees whose personal information is collected.

Differential Privacy

How to select the top items based on sensitive scores in a privacy-preserving manner:

Shekelyan & Loukides
Differentially Private Top-k Selection via Canonical Lipschitz Mechanism
ArXiv (2022) [preprint, project, pdf]

referenced in:
    IEEE ICDCS [2023]
    IEEE Transactions on Mobile Computing [2024]
    arXiv [2024, 2024b]
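
To make the top-k selection problem above concrete, here is a minimal Python sketch of a standard baseline: add Gumbel noise to every score once and report the k largest noisy scores. This is not the Canonical Lipschitz Mechanism from the preprint. The function names, the `sensitivity` parameter and the choice of noise scale (splitting the budget `epsilon` evenly over k selections under basic composition) are assumptions made for illustration only.

```python
import math
import random


def gumbel_noise(rng, scale):
    """Draw one Gumbel(0, scale) variate as -scale * log(E) with E ~ Exp(1)."""
    e = rng.expovariate(1.0)
    while e == 0.0:  # guard against log(0); practically never triggers
        e = rng.expovariate(1.0)
    return -scale * math.log(e)


def noisy_top_k(scores, k, epsilon, sensitivity=1.0, rng=None):
    """Return the indices of the k largest scores after adding Gumbel noise.

    Assumption: changing one individual's data moves each score by at most
    `sensitivity`, and the budget `epsilon` is split evenly over k rounds.
    """
    rng = rng or random.Random()
    scale = 2.0 * sensitivity * k / epsilon
    noisy = [(s + gumbel_noise(rng, scale), i) for i, s in enumerate(scores)]
    noisy.sort(reverse=True)
    return [i for _, i in noisy[:k]]
```

With a very large `epsilon` the noise becomes negligible and the call returns the true top items; with a small `epsilon` the ranking is increasingly randomised.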

Sampling

How to jump directly between the selected positions of a simple random sample while storing only a handful of values:

[Figure: Python code for sampling iterator]

Shekelyan & Cormode
Sequential Random Sampling Revisited: Hidden Shuffle Method
AISTATS (2021) [conference, project, bibtex, pdf]

referenced in:
    Journal of Applied Genetics [2023]
    Entropy [2022]
    LIPIcs CPM [2024]
    IEEE INFOCOM [2024]
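
As a point of comparison for the sequential sampling problem above, the following Python sketch shows classic selection sampling, which also emits the sample positions in increasing order with constant extra memory. It is not the Hidden Shuffle method: it inspects every position instead of jumping between the selected ones. The function name and signature are hypothetical.

```python
import random


def sequential_sample_positions(n, k, rng=None):
    """Yield k distinct positions from range(n) in increasing order.

    Classic selection sampling: position `pos` is taken with probability
    (items still needed) / (positions still remaining).
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    rng = rng or random.Random()
    needed = k
    for pos in range(n):
        if needed == 0:
            return
        remaining = n - pos
        if rng.random() * remaining < needed:
            yield pos
            needed -= 1
```

The only state kept between steps is the current position and the number of items still needed, which is what makes this style of sampling attractive for very large inputs.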

How to collect a (weighted) random sample over a huge table that is only available as a set of smaller linked tables that need to be joined together (requiring just one pass over the most troublesome table):

Shekelyan, Cormode, Ma, Shanghooshabad & Triantafillou
Streaming Weighted Sampling over Join Queries
EDBT (2023) [conference, preprint, pdf]

referenced in:
    Dissertation, University of Warwick [2022]
    PVLDB [2023]
    DEXA [2024]
    Modern Pathology [2024]
    Complex Networks & Their Applications [2023]
    Computational Intelligence and Neuroscience [2022]
    arXiv [2022b, 2022c]
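
The basic idea behind sampling from a join without materialising it can be sketched for two tables in Python: pick a row of the first table with probability proportional to its number of join partners, then pick one of those partners uniformly. This is a textbook in-memory sketch, not the streaming algorithm from the paper; it draws with replacement, handles only uniform weights, and the function name and parameters are hypothetical.

```python
import random
from collections import defaultdict


def sample_join(left, right, key_left, key_right, n_samples, rng=None):
    """Draw n_samples uniform samples (with replacement) from left JOIN right.

    Rows can be tuples or dicts; key_left / key_right index the join column.
    """
    rng = rng or random.Random()
    # Index the right table by join key.
    partners = defaultdict(list)
    for row in right:
        partners[row[key_right]].append(row)
    # Weight each left row by its number of join partners.
    weights = [len(partners.get(row[key_left], ())) for row in left]
    if sum(weights) == 0:
        return []  # the join is empty
    picks = rng.choices(left, weights=weights, k=n_samples)
    # Complete each picked left row with a uniformly chosen partner.
    return [(l, rng.choice(partners[l[key_left]])) for l in picks]
```

Each join result (l, r) is drawn with probability deg(l)/total * 1/deg(l) = 1/total, so the sample is uniform over the join even though the join itself is never built.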

Shany came up with the really cool idea of posing join sampling via probabilistic graphical models:

Shanghooshabad, Kurmanji, Ma, Shekelyan, Almasi & Triantafillou
PGMJoins: Random Join Sampling with Graphical Models
ACM SIGMOD (2021) [conference, bibtex, link]

referenced in:
    Dissertation, University of Minnesota [2022]
    Technical Report, Oregon State University [2022]
    ACM SIGMOD [2022, 2023, 2024, 2024b]
    ACM PODS [2023]
    PVLDB [2023]
    EDBT [2023a]
    IEEE TKDE [2024]
    ACM Management of Data [2023, 2024]
    ACM EdgeSys [2022]
    ACM SoCC [2023]
    ACM HILDA [2023]
    arXiv [2022b, 2022c, 2023]

Multidimensional Data Summaries

How to build tiny data models that empirically tend to be good at approximating the number of points in a rectangular range:

[Figure: DigitHist summary of spatial data, zoomed in on UK and Germany]

Shekelyan, Dignoes & Gamper
DigitHist: a Histogram-Based Data Summary with Tight Error Bounds
PVLDB (2017) [conference, link, slides, bibtex, pdf]

referenced in:
    Dissertation, Technical University of Munich [2020, 2023]
    Dissertation, University of Edinburgh [2022]
    Dissertation, University of Mannheim [2022]
    Dissertation, Hong Kong Polytechnic University [2019]
    Dissertation, Indian Institute of Science [2019]
    ICLR [2024]
    PVLDB [2018, 2019, 2020, 2024]
    IEEE ICDE [2021, 2021b, 2021c]
    EDBT [2023b]
    CIDR [2019]
    IEEE TKDE [2019, 2023]
    Data Science and Engineering [2018]
    Knowledge and Information Systems [2020, 2021]
    Information Systems [2022]
    IEEE Transactions on Industrial Informatics Systems [2024]
    Journal of Information Processing [2024]
    arXiv [2023, 2024]
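
To illustrate what a histogram-based summary with error bounds looks like in its simplest form, here is a Python sketch of a plain equi-width grid that returns a lower and an upper bound for a rectangular range count. It is far cruder than DigitHist and is not the paper's data structure. Integer coordinates, the class name and all parameters are assumptions chosen to keep the example exact.

```python
class GridHistogram:
    """Equi-width grid of point counts over the integer domain [0, size)^2."""

    def __init__(self, points, size, cell_width):
        self.w = cell_width
        self.m = -(-size // cell_width)  # number of cells per dimension (ceil)
        self.counts = [[0] * self.m for _ in range(self.m)]
        for x, y in points:
            self.counts[x // self.w][y // self.w] += 1

    def _overlap(self, i, lo, hi):
        """Classify cell i against [lo, hi): 2 contained, 1 partial, 0 disjoint."""
        c_lo, c_hi = i * self.w, (i + 1) * self.w
        if lo <= c_lo and c_hi <= hi:
            return 2
        if c_lo < hi and lo < c_hi:
            return 1
        return 0

    def range_count_bounds(self, x_lo, x_hi, y_lo, y_hi):
        """Return (lower, upper) bounds on #points in [x_lo, x_hi) x [y_lo, y_hi)."""
        lower = upper = 0
        for i in range(self.m):
            ox = self._overlap(i, x_lo, x_hi)
            if ox == 0:
                continue
            for j in range(self.m):
                oy = self._overlap(j, y_lo, y_hi)
                if oy == 0:
                    continue
                upper += self.counts[i][j]  # every touched cell may contribute
                if ox == 2 and oy == 2:
                    lower += self.counts[i][j]  # fully covered cells surely do
        return lower, upper
```

The gap between the two bounds comes only from cells that the query cuts through, so a finer grid narrows the bounds at the price of a larger summary.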

How to build compact data models that are theoretically guaranteed to be good at approximating the number of points in a rectangular range (not just asymptotically!):

Shekelyan, Dignoes, Gamper & Garofalakis
Approximating Multidimensional Range Counts with Maximum Error Guarantees
IEEE ICDE (2021) [conference, bibtex, pdf]

referenced in:
    Information Systems [2022]

How to approximate arbitrary rectangles with a few pre-selected rectangles:

Cormode, Garofalakis & Shekelyan (alphabetically ordered)
Data-Independent Space Partitionings for Summaries
ACM PODS (2021) [conference, project, bibtex, pdf]

referenced in:
    EDBT [2022]
    IEEE WIECON-ECE [2021]
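
A one-dimensional instance of covering arbitrary ranges with pre-selected ones is the classic dyadic decomposition, sketched below in Python. Every integer interval is split into a small number of intervals whose length is a power of two and whose start is a multiple of that length. This is a textbook construction shown for intuition, not necessarily the partitioning analysed in the paper; the function name is hypothetical.

```python
def dyadic_cover(lo, hi):
    """Split the integer interval [lo, hi), with lo >= 0, into dyadic pieces.

    Each piece has the form [j * 2**level, (j + 1) * 2**level).
    """
    pieces = []
    while lo < hi:
        # Largest power of two dividing lo (unbounded if lo == 0) ...
        size = lo & -lo if lo > 0 else 1 << (hi - lo).bit_length()
        # ... shrunk until the block fits into the remaining range.
        while lo + size > hi:
            size //= 2
        pieces.append((lo, lo + size))
        lo += size
    return pieces
```

For example, `dyadic_cover(3, 11)` yields the four pieces (3, 4), (4, 8), (8, 10) and (10, 11), so a summary only needs to store one value per dyadic interval to answer any range exactly from a handful of pieces.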

Query Processing

How to compute sums over sub-tables for a very large table of numbers, most of which are equal to zero:

Shekelyan, Dignoes & Gamper
Sparse prefix sums: Constant-time range sum queries over sparse multidimensional data cubes
Information Systems (2019) [journal, slides, bibtex, link]

referenced in:
    Nucleic Acids Research [2024]
    arXiv [2021]
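
For context, the dense technique that sparse prefix sums build on can be sketched in a few lines of Python: precompute a table of cumulative sums once, then answer any sub-table sum with four lookups. This sketch stores one value per cell and therefore does not exploit sparsity, which is the point of the paper; the function names are hypothetical.

```python
def build_prefix_sums(table):
    """Return P with P[r][c] = sum of table[i][j] for all i < r and j < c."""
    rows = len(table)
    cols = len(table[0]) if rows else 0
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            prefix[r + 1][c + 1] = (table[r][c] + prefix[r][c + 1]
                                    + prefix[r + 1][c] - prefix[r][c])
    return prefix


def range_sum(prefix, r_lo, r_hi, c_lo, c_hi):
    """Sum of table[r][c] for r_lo <= r < r_hi and c_lo <= c < c_hi."""
    # Inclusion-exclusion over the four corners of the sub-table.
    return (prefix[r_hi][c_hi] - prefix[r_lo][c_hi]
            - prefix[r_hi][c_lo] + prefix[r_lo][c_lo])
```

The query cost is constant regardless of the size of the sub-table, but the prefix table is as large as the original one even when almost all entries are zero.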

How to find all paths between two network nodes that could be best for some user preference:

[Figure: optimality for some linear scalarization]

Shekelyan, Josse & Schubert
Linear path skylines in bicriteria networks
DASFAA (2014) [conference, link, project, bibtex, pdf]

referenced in:
    Dissertation, University of Munich [2016, 2016b]
    Dissertation, University of Alberta [2017, 2020]
    Dissertation, Technical University of Dortmund [2018]
    IEEE ICDE [2015, 2015b]
    ACM SIGSPATIAL [2017, 2017b, 2017c, 2020, 2020b]
    SSTD [2015, 2015b, 2015c, 2017]
    IEEE MDM [2020, 2021]
    EMO [2017]
    VEHITS [2016]
    IPSI Bgd Transactions on Internet Research [2024]
    Journal of Internet Technology [2019]
    IET Intelligent Transport Systems [2019]
    Journal of Big Data [2023]
    Information Systems [2016]
    ACM Transactions on Spatial Algorithms and Systems [2020]
    IPSI Transactions on Internet Research [2024]
    Geoinformatica [2017, 2018]
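
The notion of being optimal for some linear scalarization can be illustrated on a small list of bicriteria path costs. The Python sketch below assumes the candidate costs are already enumerated, which is exactly what the papers avoid on real networks; it only shows which cost vectors survive. It first drops dominated costs and then keeps the vertices of the lower-left convex hull. The function names are hypothetical, and costs that merely tie with others on a hull edge are dropped.

```python
def _cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def linear_skyline(costs):
    """Return the cost pairs that uniquely minimise w*c1 + (1-w)*c2 for some w."""
    # Step 1: Pareto filter. After sorting by the first criterion, a cost
    # survives only if it strictly improves the second criterion.
    pareto = []
    for a, b in sorted(set(costs)):
        if not pareto or b < pareto[-1][1]:
            pareto.append((a, b))
    # Step 2: lower convex hull (monotone chain) over the Pareto front.
    hull = []
    for p in pareto:
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull
```

For the costs (1, 10), (2, 6), (4, 5), (5, 1), (6, 6) and (3, 9), the result is (1, 10), (2, 6) and (5, 1): the cost (4, 5) is not dominated by any other path, yet no weighting of the two criteria makes it the best choice.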

Shekelyan, Josse & Schubert
ParetoPrep: Efficient Lower Bounds for Path Skylines and Fast Path Computation
SSTD (2015) [conference, link, project, bibtex, pdf]

referenced in:
    Dissertation, University of Munich [2016, 2016b]
    Dissertation, Technical University of Dortmund [2018]
    ACM SIGSPATIAL [2017c]
    ACM SIGSPATIAL IWCTS workshop [2024]
    SSTD [2015b, 2015c, 2015d]
    EMO [2017]
    Geoinformatica [2017]

Shekelyan, Josse & Schubert
Linear path skylines in multicriteria networks
IEEE ICDE (2015) [conference, link, project, bibtex, pdf]

referenced in:
    Dissertation, University of Munich [2016]
    Dissertation, University of Technology Sydney [2019]
    Dissertation, New Mexico State University [2021]
    Dissertation, Université de Bordeaux [2021]
    IEEE ICDE [2019, 2020]
    EDBT [2018]
    ACM SIGSPATIAL [2015, 2017c, 2018, 2024]
    ACM SIGSPATIAL IWCTS workshop [2023]
    SSTD [2015b, 2015c]
    DASFAA [2018, 2023]
    IEEE MDM [2021]
    IEEE HPCC / SmartCity / DSS [2016]
    IEEE LifeTech [2021]
    ACM Transactions on Spatial Algorithms and Systems [2024]
    ATMOS [2020]
    Mathematical Problems in Engineering [2018]
    Geoinformatica [2017]

Websites

How do we turn computer "science" into computer science? [link]

How do we fix peer-review? [link]

How do we get fewer papers with more quality? [link]

London Nightvoucher Project

This is currently just an idea born out of my own experiences living in London. I am still learning about the intricacies involved and the potential stumbling blocks ahead, but let me know if you are in any way interested in making it easier to make dedicated donations towards accommodation for people sleeping rough. More details can be found on the project website [nightvoucher.org.uk].

Note: The views and opinions expressed on this site are those of the authors and do not necessarily reflect the official policy or position of their employers.