Universal Streaming of Subset Norms

by Vladimir Braverman, Robert Krauthgamer, and Lin F. Yang

Theory of Computing, Volume 18(20), pp. 1-32, 2022

Bibliography with links to cited articles

[1]   Noga Alon, Nick Duffield, Carsten Lund, and Mikkel Thorup: Estimating arbitrary subset sums with few probes. In Proc. 24th Symp. on Principles of Database Systems (PODS’05), pp. 317–325. ACM Press, 2005. [doi:10.1145/1065167.1065209]

[2]   Noga Alon, Yossi Matias, and Mario Szegedy: The space complexity of approximating the frequency moments. J. Comput. System Sci., 58(1):137–147, 1999. Preliminary version in STOC’96. [doi:10.1006/jcss.1997.1545]

[3]   Alexandr Andoni: High frequency moments via max-stability. In Proc. 42nd Internat. Conf. on Acoustics, Speech and Signal Processing (ICASSP’17), pp. 6364–6368. IEEE Comp. Soc., 2017. [doi:10.1109/ICASSP.2017.7953381]

[4]   Alexandr Andoni, Jiecao Chen, Robert Krauthgamer, Bo Qin, David P. Woodruff, and Qin Zhang: On sketching quadratic forms. In Proc. 7th Innovations in Theoret. Comp. Sci. Conf. (ITCS’16), pp. 311–319. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2016. [doi:10.1145/2840728.2840753, arXiv:1511.06099]

[5]   Alexandr Andoni, Robert Krauthgamer, and Krzysztof Onak: Streaming algorithms via precision sampling. In Proc. 52nd FOCS, pp. 363–372. IEEE Comp. Soc., 2011. [doi:10.1109/FOCS.2011.82]

[6]   Ziv Bar-Yossef, T. S. Jayram, Robert Krauthgamer, and Ravi Kumar: The sketching complexity of pattern matching. In Proc. 8th Internat. Workshop on Randomization and Computation (RANDOM’04), pp. 261–272. Springer, 2004. [doi:10.1007/978-3-540-27821-4_24]

[7]   Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, and D. Sivakumar: An information statistics approach to data stream and communication complexity. J. Comput. System Sci., 68(4):702–732, 2004. Preliminary version in FOCS’02. [doi:10.1016/j.jcss.2003.11.006]

[8]   Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, D. Sivakumar, and Luca Trevisan: Counting distinct elements in a data stream. In Proc. 6th Internat. Workshop on Randomization and Computation (RANDOM’02), pp. 1–10. Springer, 2002. [doi:10.1007/3-540-45726-7_1]

[9]   Lakshminath Bhuvanagiri, Sumit Ganguly, Deepanjan Kesh, and Chandan Saha: Simpler algorithm for estimating frequency moments of data streams. In Proc. 17th Ann. ACM–SIAM Symp. on Discrete Algorithms (SODA’06), pp. 708–713. SIAM, 2006. [doi:10.1145/1109557.1109634]

[10]   Jarosław Błasiok: Optimal streaming and tracking distinct elements with high probability. ACM Trans. Algorithms, 16(1):3:1–28, 2020. Preliminary version in SODA’18. [doi:10.1145/3309193, arXiv:1804.01642]

[11]   Jarosław Błasiok, Vladimir Braverman, Stephen R. Chestnut, Robert Krauthgamer, and Lin F. Yang: Streaming symmetric norms via measure concentration. In Proc. 49th STOC, pp. 716–729. ACM Press, 2017. [doi:10.1145/3055399.3055424, arXiv:1511.01111]

[12]   Mark Braverman, Ankit Garg, Denis Pankratov, and Omri Weinstein: Information lower bounds via self-reducibility. Theory Computing Sys., 59(2):377–396, 2016. Preliminary version in CSR’13. [doi:10.1007/s00224-015-9655-z, ECCC:TR12-177]

[13]   Vladimir Braverman and Stephen R. Chestnut: Universal sketches for the frequency negative moments and other decreasing streaming sums. In Proc. 19th Internat. Workshop on Randomization and Computation (RANDOM’15), pp. 591–605. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2015. [doi:10.4230/LIPIcs.APPROX-RANDOM.2015.591]

[14]   Vladimir Braverman, Stephen R. Chestnut, Nikita Ivkin, Jelani Nelson, Zhengyu Wang, and David P. Woodruff: BPTree: An 2 heavy hitters algorithm using constant memory. In Proc. 36th Symp. on Principles of Database Systems (PODS’17), pp. 361–376. ACM Press, 2017. [doi:10.1145/3034786.3034798, arXiv:1603.00759]

[15]   Vladimir Braverman, Stephen R. Chestnut, David P. Woodruff, and Lin F. Yang: Streaming space complexity of nearly all functions of one variable on frequency vectors. In Proc. 35th Symp. on Principles of Database Systems (PODS’16), pp. 261–276. ACM Press, 2016. [doi:10.1145/2902251.2902282, arXiv:1601.07473]

[16]   Vladimir Braverman, Jonathan Katzman, Charles Seidell, and Gregory Vorsanger: An optimal algorithm for large frequency moments using O(n1-2∕k) bits. In Proc. 18th Internat. Workshop on Randomization and Computation (RANDOM’14), pp. 531–544. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2014. [doi:10.4230/LIPIcs.APPROX-RANDOM.2014.531]

[17]   Vladimir Braverman and Rafail Ostrovsky: Smooth histograms for sliding windows. In Proc. 48th FOCS, pp. 283–293. IEEE Comp. Soc., 2007. [doi:10.1109/FOCS.2007.55]

[18]   Vladimir Braverman and Rafail Ostrovsky: Zero-one frequency laws. In Proc. 42nd STOC, pp. 281–290. ACM Press, 2010. [doi:10.1145/1806689.1806729]

[19]   Vladimir Braverman, Rafail Ostrovsky, and Alan Roytman: Zero-one laws for sliding windows and universal sketches. In Proc. 19th Internat. Workshop on Randomization and Computation (RANDOM’15), pp. 573–590. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2015. [doi:10.4230/LIPIcs.APPROX-RANDOM.2015.573]

[20]   Vladimir Braverman, Emanuele Viola, David P. Woodruff, and Lin F. Yang: Revisiting frequency moment estimation in random order streams. In Proc. 45th Internat. Colloq. on Automata, Languages, and Programming (ICALP’18), pp. 25:1–14. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2018. [doi:10.4230/LIPIcs.ICALP.2018.25, arXiv:1803.02270]

[21]   Amit Chakrabarti, Subhash Khot, and Xiaodong Sun: Near-optimal lower bounds on the multi-party communication complexity of set disjointness. In Proc. 18th IEEE Conf. on Comput. Complexity (CCC’03), pp. 107–117. IEEE Comp. Soc., 2003. [doi:10.1109/CCC.2003.1214414]

[22]   Amit Chakrabarti and Oded Regev: An optimal lower bound on the communication complexity of GAP-HAMMING-DISTANCE. SIAM J. Comput., 41(5):1299–1317, 2012. [doi:10.1137/120861072]

[23]   Moses Charikar, Kevin Chen, and Martin Farach-Colton: Finding frequent items in data streams. Theoret. Comput. Sci., 312(1):3–15, 2004. Preliminary version in ICALP’02. [doi:10.1016/S0304-3975(03)00400-6]

[24]   Edith Cohen, Nick Duffield, Haim Kaplan, Carstent Lund, and Mikkel Thorup: Algorithms and estimators for summarization of unaggregated data streams. J. Comput. System Sci., 80(7):1214–1244, 2014. [doi:10.1016/j.jcss.2014.04.009]

[25]   Graham Cormode and S. Muthukrishnan: Combinatorial algorithms for compressed sensing. In Proc. 13th Internat. Colloq. on Structural Information and Communication Complexity (SIROCCO’06), pp. 280–294. Springer, 2006. [doi:10.1007/11780823_22]

[26]   Nick Duffield, Carsten Lund, and Mikkel Thorup: Priority sampling for estimation of arbitrary subset sums. J. ACM, 54(6):32, 2007. [doi:10.1145/1314690.1314696]

[27]   Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, and Jian Zhang: On graph problems in a semi-streaming model. Theoret. Comput. Sci., 348(2–3):207–216, 2005. Preliminary version in ICALP’04. [doi:10.1016/j.tcs.2005.09.013]

[28]   Philippe Flajolet and G. Nigel Martin: Probabilistic counting algorithms for data base applications. J. Comput. System Sci., 31(2):182–209, 1985. [doi:10.1016/0022-0000(85)90041-8]

[29]   Arpit Gupta, Rob Harrison, Marco Canini, Nick Feamster, Jennifer Rexford, and Walter Willinger: Sonata: Query-driven streaming network telemetry. In Proc. 31st SIG Data Communication Conf. (SIGCOMM’18), pp. 357–371. ACM Press, 2018. [doi:10.1145/3230543.3230555, arXiv:1705.01049]

[30]   Sariel Har-Peled and Soham Mazumdar: On coresets for k-means and k-median clustering. In Proc. 36th STOC, pp. 291–300. ACM Press, 2004. [doi:10.1145/1007352.1007400]

[31]   Piotr Indyk: Stable distributions, pseudorandom generators, embeddings, and data stream computation. J. ACM, 53(3):307–323, 2006. Preliminary version in FOCS’00. [doi:10.1145/1147954.1147955]

[32]   Piotr Indyk and David P. Woodruff: Tight lower bounds for the distinct elements problem. In Proc. 44th FOCS, pp. 283–288. IEEE Comp. Soc., 2003. [doi:10.1109/SFCS.2003.1238202]

[33]   Piotr Indyk and David P. Woodruff: Optimal approximations of the frequency moments of data streams. In Proc. 37th STOC, pp. 202–208. ACM Press, 2005. [doi:10.1145/1060590.1060621]

[34]   T. S. Jayram and David P. Woodruff: Optimal bounds for Johnson-Lindenstrauss transforms and streaming problems with subconstant error. ACM Trans. Algorithms, 9(3):26:1–17, 2013. Preliminary version in SODA’11. [doi:10.1145/2483699.2483706]

[35]   Daniel M. Kane, Jelani Nelson, Ely Porat, and David P. Woodruff: Fast moment estimation in data streams in optimal space. In Proc. 43rd STOC, pp. 745–754. ACM Press, 2011. [doi:10.1145/1993636.1993735, arXiv:1007.4191]

[36]   Daniel M. Kane, Jelani Nelson, and David P. Woodruff: On the exact space complexity of sketching and streaming small norms. In Proc. 21st Ann. ACM–SIAM Symp. on Discrete Algorithms (SODA’10), pp. 1161–1178. SIAM, 2010. [doi:10.1137/1.9781611973075.93]

[37]   Daniel M. Kane, Jelani Nelson, and David P. Woodruff: An optimal algorithm for the distinct elements problem. In Proc. 29th Symp. on Principles of Database Systems (PODS’10), pp. 41–52. ACM Press, 2010. [doi:10.1145/1807085.1807094]

[38]   Ilan Kremer, Noam Nisan, and Dana Ron: On randomized one-round communication complexity. Comput. Complexity, 8(1):21–49, 1999. Preliminary version in STOC’95, Errata in Comput. Complexity 10 (2001), 314–315. [doi:10.1007/s000370050018]

[39]   Zaoxing Liu, Antonis Manousis, Gregory Vorsanger, Vyas Sekar, and Vladimir Braverman: One sketch to rule them all: Rethinking network flow monitoring with univmon. In Proc. 29th SIG Data Communication Conf. (SIGCOMM’16), pp. 101–114. ACM Press, 2016. [doi:10.1145/2934872.2934906]

[40]   Peter Bro Miltersen, Noam Nisan, Shmuel Safra, and Avi Wigderson: On data structures and asymmetric communication complexity. J. Comput. System Sci., 57(1):37–49, 1998. Preliminary version in STOC’95. [doi:10.1006/jcss.1998.1577]

[41]   Srinivas Narayana, Anirudh Sivaraman, Vikram Nathan, Prateesh Goyal, Venkat Arun, Mohammad Alizadeh, Vimalkumar Jeyakumar, and Changhoon Kim: Language-directed hardware design for network performance monitoring. In Proc. 30th SIG Data Communication Conf. (SIGCOMM’17), pp. 85–98. ACM Press, 2017. [doi:10.1145/3098822.3098829]

[42]   Jelani Nelson: List of open problems in sublinear algorithms: Problem 30. sublinear.info, accessed 2018-06-20.

[43]   Norbert Sauer: On the density of families of sets. J. Combin. Theory–A, 13(1):145–147, 1972. [doi:10.1016/0097-3165(72)90019-2]

[44]   Vyas Sekar, Nick G. Duffield, Oliver Spatscheck, Jacobus E. van der Merwe, and Hui Zhang: LADS: Large-scale automated DDoS detection system. In Proc. 12th USENIX Annual Technical Conference, General Track (UATC’06), pp. 171–184. USENIX Association, 2006. USENIX.

[45]   Vyas Sekar, Michael K. Reiter, and Hui Zhang: Revisiting the case for a minimalist approach for network flow monitoring. In Proc. 10th SIGCOMM Internet Measurement Conf. (IMC’10), pp. 328–341. ACM Press, 2010. [doi:10.1145/1879141.1879186]

[46]   Saharon Shelah: A combinatorial problem; stability and order for models and theories in infinitary languages. Pacific J. Math, 41(1):247–261, 1972. [doi:10.2140/pjm.1972.41.247]

[47]   Anshumali Shrivastava, Arnd Christian Konig, and Mikhail Bilenko: Time adaptive sketches (ada-sketches) for summarizing data streams. In Proc. 41st Internat. Conf. on Management of Data (SIGMOD’16), pp. 1417–1432. ACM Press, 2016. [doi:10.1145/2882903.2882946]

[48]   Mario Szegedy: The DLT priority sampling is essentially optimal. In Proc. 38th STOC, pp. 150–158. ACM Press, 2006. [doi:10.1145/1132516.1132539]

[49]   Daniel Ting: Data sketches for disaggregated subset sum and frequent item estimation. In Proc. 44th Internat. Conf. on Management of Data (SIGMOD’18), pp. 1129–1140. ACM Press, 2018. [doi:10.1145/3183713.3183759, arXiv:1709.04048]

[50]    David Vengerov, Andre Cavalheiro Menck, Mohamed Zait, and Sunil P. Chakkappen: Join size estimation subject to filter conditions. Proc. VLDB Endowment, 8(12):1530–1541, 2015. [doi:10.14778/2824032.2824051]

[51]   David P. Woodruff: Optimal space lower bounds for all frequency moments. In Proc. 15th Ann. ACM–SIAM Symp. on Discrete Algorithms (SODA’04), pp. 167–175. SIAM, 2004. ACM DL.