Selected publications

ICPP15 J.D. Garvey and T.S. Abdelrahman, "Automatic performance tuning of stencil computations on GPUs," Proc. of Int'l Conference on Parallel Processing (ICPP), Beijing, China, September 2015.
ISCA15 C. Segulja and T.S. Abdelrahman, "CLEAN: a race detector with cleaner semantics," Proc. of the ACM/IEEE Int'l Symposium on Computer Architecture (ISCA), Portland, Oregon, June 2015.
CF15 A. Chiu, J. Garvey and T.S. Abdelrahman, "Genesis: A language for generating synthetic training programs for machine learning," Proc. of the ACM Int'l Conference on Computing Frontiers (CF), Ischia, Italy, May 2015.
ADAPT15 T.D. Han and T.S. Abdelrahman, "Automatic Tuning of Local Memory Use on GPGPUs," Proc. of the fifth Int'l Workshop on Adaptive Self-Tuning Computing Systems (ADAPT), Amsterdam, Netherlands, January 2015.
FPL14 D. Capalija and T.S. Abdelrahman, "Tile-based bottom-up compilation of custom mesh-of-FUs FPGA overlays," Proc. of the Int'l Conference on Field Programmable Logic and Applications (FPL), Munich, Germany, September 2014.
PACT14 C. Segulja and T.S. Abdelrahman, "What is the cost of weak determinism?" Proc. of the Int'l Conference on Parallel Architectures and Compilation Techniques (PACT), Edmonton, AB, Canada, August 2014.
GTC14 D. Han and T.S. Abdelrahman, "GPU Performance Auto-Tuning Using Machine Learning," poster presentation, GPU Technology Conference, San Jose, March 2014. (Received Best Poster Award).
ICPP13 M.C. Delorme, T.S. Abdelrahman and C. Zhao, "Parallel radix sort on the AMD Fusion accelerated processing unit," Proc. of Int'l Conference on Parallel Processing (ICPP), pp. 339-348, Lyon, France, October 2013.
FPL13 D. Capalija and T.S. Abdelrahman, "A high-performance overlay architecture for pipelined execution of dataflow graphs," Proc. of the Int'l Conference on Field Programmable Logic and Applications (FPL), Porto, Portugal, September 2013.
GPGPU13 T.D. Han and T.S. Abdelrahman, "Reducing divergence in GPGPU programs with loop merging," Proc. of Sixth Workshop on General Purpose Processing on Graphics Processing Units (GPGPU), Houston, TX, March 2013.
TPDS13 D. Capalija and T.S. Abdelrahman, "Microarchitecture of a coarse-grain out-of-order superscalar processor," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 2, pp. 392-405, February 2013.
TPDS12 U. Aydonat and T.S. Abdelrahman, "Relaxed concurrency control in software transactional memory," IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 7, pp. 1312-1325, July 2012.
JCSE12 B. Bradel and T.S. Abdelrahman, "Inlining with traces in Java programs," Int'l Journal Computer Systems Science and Engineering, vol. 27, no. 4, July 2012.
CGO12 I. Matosevic and T.S. Abdelrahman, "Efficient bottom-up heap analysis for symbolic path-based data access summaries," Proc. of Code Generation and Optimization (CGO), San Jose, CA, March 2012.
HPCA12 C. Segulja and T.S. Abdelrahman, "Architectural support for synchronization-free deterministic parallel programming," Proc. of High-Performance Computer Architecture (HPCA), pp. 337-348, New Orleans, LA, February 2012.
JEC11 U. Aydonat and T.S. Abdelrahman, "Parallelization of multimedia applications on the Multi-Level Computing Architecture," Journal of Embedded Computing, vol. 4, no. 3, pp. 187-106, October 2011.
FCCM11 D. Capalija and T.S. Abdelrahman, "Towards synthesis-free JIT compilation to commodity FPGAs," Proc. of IEEE Int'l Symposium on Field-Programmable Custom Computing Machines (FCCM), Salt Lake City, UT, pp. 202-205, May 2011.
GPGPU11 T.D. Han and T.S. Abdelrahman, "Reducing branch divergence in GPU programs," Proc. of fourth Workshop on General Purpose Processing on Graphics Processing Units (GPGPU), Newport Beach, CA, March 2011.
TPDS11 T.D. Han and T.S. Abdelrahman, "hiCUDA: high-level GPGPU programming," IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 1, pp. 78-90, January 2011.
MICRO10 U. Aydonat and T.S. Abdelrahman, "Hardware support for relaxed concurrency control in transactional memory," Proc. of the Int'l Symposium on Microarchitecture (MICRO), pp. 15-26, December 2010.
FPT09 D. Capalija and T.S. Abdelrahman, "An Architecture for Exploiting Coarse-Grain Parallelism on FPGAs," Proc. of IEEE Int'l Conference on Field-Programmable Technology (FPT), Sydney, Australia, pp. 285-291, December 2009.
PPPJ09 B. Bradel and T.S. Abdelrahman, "The use of hardware transactional memory for the trace-based parallelization of recursive Java programs," Proc. of Int'l Conference on Principles and Practices of Programming in Java (PPPJ), pp. 101-110, Calgary, AB, August 2009.
SCP09 B. Bradel and T.S. Abdelrahman, "A study of potential parallelism among traces in Java programs," Science of Computer Programming, vol. 74, no. 5-6, pp. 296-313, March 2009.
GPGPU09 T.D. Han and T.S. Abdelrahman, "hiCUDA: a high-level directive-based language for GPU programming," Proc. of second Workshop on General Purpose Processing on Graphics Processing Units (GPGPU), pp. 52-61, Washington, D.C., March 2009.
TRANSACT09 U. Aydonat and T.S. Abdelrahman, "Hardware support for serializable transactions: a study of feasibility and performance," Proc. of the fourth ACM Workshop on Transactional Computing (TRANSACT), Raleigh, NC, February 2009.
TRANSACT08 U. Aydonat and T.S. Abdelrahman, "Serializability of transactions in software transactional memory," Proc. of the third ACM Workshop on Transactional Computing (TRANSACT), Salt Lake City, UT, February 2008.
PDCAS07 K. Stewart and T.S. Abdelrahman, "Automatic task generation for the Multi-Level Computing Architecture," Proc. of the Int'l Conference on Parallel and Distributed Computing and Systems (PDCAS), pp. 250-259, Boston, MA, November 2007.
ICPP07 B. Bradel and T.S. Abdelrahman, "Trace-based automatic parallelization of Java programs," Proc. of Int'l Conference on Parallel Processing (ICPP), pp. 26-37, Xi'an, China, September 2007.
PPPJ07 B. Bradel and T.S. Abdelrahman, "The potential of trace-level parallelism in Java programs," Proc. of Int'l Conference on Principles and Practices of Programming in Java (PPPJ), pp. 167-174, Lisbon, Portugal, September 2007.
PDCAS06 U. Aydonat and T.S. Abdelrahman, "Parallelization of multimedia applications on the Multi-Level Computing Architecture," Proc. of the Int'l Conference on Parallel and Distributed Computing and Systems (PDCAS), pp. 438-447, Dallas, TX, November 2006. (Received Best Paper award).
ESTIMEDIA06 A. Abdelkhalek and T.S. Abdelrahman, "Locality management using multiple SPMs on the Multi-Level Computing Architecture," Proc. of the 4th IEEE/ACM Workshop on Embedded Systems for Real-Time Multimedia (ESTIMedia), pp. 67-72, Seoul, Korea, October 2006.
NEWCAS06 T.S. Abdelrahman, A. Abdelkhalek, U. Aydonat, D. Capalija, D. Han, I. Matosevic, K. Stewart, F. Karim and A. Mellan, "The MLCA: A solution paradigm for parallel programmable SoCs," (invited paper), Proc. of the IEEE Northeast Workshop on Circuits and Systems (IEEE-NEWCAS), pp. 253-256, Gatineau, Canada, June 2006.
SCOPES05 I. Matosevic, T.S. Abdelrahman, F. Karim and A. Mellan, "Power optimizations for the MLCA using dynamic voltage scaling," Proc. of the Int'l Workshop on Software and Compilers for Embedded Systems (SCOPES), pp. 109-123, Dallas, TX, September 2005.
PDCAS04 C. Cavanna, T.S. Abdelrahman, A. Bilas and P. Jamieson, "Jupiter/SVM: a JVM-based single system image for clusters of workstations," Proc. of the Int'l Conference on Parallel and Distributed Computing and Systems (PDCAS), pp. 121-129, Cambridge, MA, November 2004. (Nominated for Best Paper award).
MICRO04 F. Karim, A. Mellan, A. Nguyen, U. Aydonat and T.S. Abdelrahman, "A multi-level computing architecture for multimedia applications," IEEE Micro, vol. 24, no. 3, pp. 55-66, May-June 2004. Also appears in ST Journal of Research, vol. 1, no. 2, pp. 4-16, September 2004.
LCPC04 B. Bradel and T.S. Abdelrahman, "The use of traces for inlining in Java programs," Proc. of the Seventeenth Int'l Workshop on Languages and Compilers for Parallel Computers (LCPC), pp. 179-193, West Lafayette, IN, September 2004.
IVME04 B. Vitale and T.S. Abdelrahman, "Catenation and specialization for Tcl virtual machine performance," Proc. of the Workshop on Interpreters, Virtual Machines and Emulators (IVME), pp. 42-50, Washington, DC, June 2004.
JSC04 B. Chan and T.S. Abdelrahman, "Run-time support for the automatic parallelization of Java programs," Journal of Supercomputing, vol. 28, no. 1, pp. 91-117, April 2004.
SPE04 P. Doyle, C. Cavanna and T.S. Abdelrahman, "The design and implementation of a flexible and extensible Java Virtual Machine," Software: Practice and Experience, vol. 34, no. 3, pp. 287-313, March 2004.
JCSE04 T.S. Abdelrahman and R. Sawaya, "Improving the structure of loop nests in scientific programs," Int'l Journal Computer Systems Science and Engineering, vol. 19, no. 1, pp. 11-25, January 2004.
WASP03 F. Karim, A. Mellan, U. Aydonat, T.S. Abdelrahman, B. Stramm and A. Nguyen, "The Hyperprocessor: a template system-on-chip architecture for embedded multimedia applications," Proc. of the Workshop on Application Specific Processors, pp. 66-73, San Diego, CA, December 2003.
JVM02 P. Doyle and T.S. Abdelrahman, "A modular and extensible JVM infrastructure," Proc. of the USENIX Java Virtual Machine Research and Technology Symposium, pp. 65-78, San Francisco, CA, August 2002.
PDCAS01a B. Chan and T.S. Abdelrahman, "Run-time support for the automatic parallelization of Java programs," Proc. of Int'l Conference on Parallel and Distributed Computing and Systems, pp. 113-120, Anaheim, CA, August 2001. (Received Best Paper award).
PDCAS01b M. Soukup and T.S. Abdelrahman, "A source-to-source OpenMP compiler," Proc. of Int'l Conference on Parallel and Distributed Computing and Systems, pp. 106-112, Anaheim, CA, August 2001.
JHPC01a P. Doyle and T.S. Abdelrahman, "Jupiter: a modular and extensible JVM," Proc. of the Third Annual Workshop on Java for High Performance Computing, Sorrento, Italy, pp. 37-48, June 2001.
JHPC01b N. Brewster and T.S. Abdelrahman, "A compiler infrastructure for high-performance Java research," Proc. of the HPCN Workshop on Java in High-Performance Computing, Amsterdam, The Netherlands, pp. 675-684, June 2001.
TPDS01 N. Manjikian and T.S. Abdelrahman, "Exploiting wavefront parallelism on large-scale shared-memory multiprocessors," IEEE Trans. on Parallel and Distributed Systems, vol. 12, no. 3, pp. 259-271, March 2001.
PDCAS00 T.S. Abdelrahman and R. Sawaya, "Increasing perfect nests in scientific programs," Proc. of Int'l Conference on Parallel and Distributed Computing and Systems, pp. 279-285, Las Vegas, NV, November 2000.
ICPP00 R. Grindley, T.S. Abdelrahman, S. Brown, S. Caranci, D. DeVries, B. Gamsa, A. Grbic, M. Gusat, R. Ho, O. Krieger, G. Lemieux, K. Loveless, N. Manjikian, P. McHardy, S. Srbljic, M. Stumm, Z. Vranesic and Z. Zilic, "The NUMAchine Multiprocessor," Proc. of Int'l Conference on Parallel Processing (ICPP), pp. 487-496, Toronto, Ontario, August 2000.
JPDCP99 T.S. Abdelrahman and G. Liu, "Overlap of computation and communications on shared-memory networks-of-workstations," Journal of Parallel and Distributed Computing Practices, vol. 2, no. 2, pp. 145-153, June 1999.
JSC98 T.S. Abdelrahman and T.N. Wong, "Compiler support for data distribution on NUMA multiprocessors," Journal of Supercomputing, vol. 12, no. 4, pp. 349-371, October 1998.
PDPTA98 G. Liu and T.S. Abdelrahman, "Computation-communication overlap on network-of-workstation multiprocessors," Proc. of the Int'l Conference on Parallel and Distributed Processing Techniques and Applications, pp. 1635-1642, Las Vegas, NV, July 1998.
ICPP97 S. Tandri and T.S. Abdelrahman, "Automatic data and computation partitioning on scalable shared memory multiprocessors," Proc. of the Int'l Conference on Parallel Processing (ICPP), pp. 64-73, Bloomingdale, IL, August 1997.
TPDS97 N. Manjikian and T.S. Abdelrahman, "Fusion of loops for parallelism and locality," IEEE Trans. on Parallel and Distributed Systems, vol. 8, no. 2, pp. 193-209, February 1997.
JSC96 T.S. Abdelrahman, "Latency hiding on COMA multiprocessors," Journal of Supercomputing, vol. 10, no. 3, pp. 225-242, November 1996.
LCPC96 S. Tandri and T.S. Abdelrahman, "Automatic data and computation partitioning on scalable shared memory multiprocessors (extended abstract)," Proc. of the Ninth Int'l Workshop on Languages and Compilers for Parallel Computers, pp. 600-602, San Jose, CA, August 1996.
ICPP96 N. Manjikian and T.S. Abdelrahman, "Scheduling of wavefront parallelism on scalable shared memory multiprocessors," Proc. of the Int'l Conference on Parallel Processing (ICPP), pp. III-122-III-131, Bloomingdale, IL, August 1996.
PDPTA96a T.S. Abdelrahman and S. Huynh, "Exploiting Task-Level Parallelism Using pTask," Proc. of the Int'l Conference on Parallel and Distributed Processing Techniques and Applications, pp. 252-263, Sunnyvale, CA, August 1996.
PDPTA95a T.S. Abdelrahman, "Latency hiding on COMA multiprocessors," Proc. of the Int'l Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), pp. 363-372, Athens, GA, November 1995. (Received Best Paper award).
PDPTA95b S. Tandri and T.S. Abdelrahman, "Computation and data partitioning on scalable shared memory multiprocessors," Proc. of the Int'l Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), pp. 41-50, Athens, GA, November 1995.
PDCS95 N. Manjikian and T.S. Abdelrahman, "Array data layout for the reduction of cache conflicts," Proc. of the 8th Int'l Conference on Parallel and Distributed Computing Systems, pp. 111-118, Orlando, FL, September 1995.
ICPP95 N. Manjikian and T.S. Abdelrahman, "Fusion of loops for parallelism and locality," Proc. of the Int'l Conference on Parallel Processing (ICPP), pp. II-19-II-28, Oconomowoc, WI, August 1995.
PDCAS94 T.S. Abdelrahman, "Performance of parallel branch and bound algorithms on the KSR1 multiprocessor," Proc. of the Int'l Conference on Parallel and Distributed Computing and Systems, pp. 52-58, Washington, D.C., October 1994.
SHPCC94 T.S. Abdelrahman and T.N. Wong, "Distributed array data management on NUMA multiprocessors," Proc. of the Scalable High-Performance Computing Conference, pp. 550-559, Knoxville, TN, May 1994.
CSS92 V. Kommu, I. Pomerantz and T.S. Abdelrahman, "A genetic learning strategy in constrained search spaces," Proc. of the Twenty-Fifth Hawaii Int'l Conference on System Sciences, pp. 26-35, Kailua-Kona, HI, January 1992.
HYPER88 T.S. Abdelrahman and T.N. Mudge, "Parallel branch and bound algorithms on hypercube multiprocessors," Proc. of the 3rd Conference on Hypercube Concurrent Computers and Applications, pp. 1492-1499, Pasadena, CA, January 1988.
JSA87 W.R. Martin, T.C. Wan, T.S. Abdelrahman and T.N. Mudge, "Monte Carlo photon transport on shared memory and distributed memory parallel processors," Int'l Journal of Supercomputing Applications, vol. 1, no. 2, pp. 57-74, September 1987.
JPDC87 T.N. Mudge and T.S. Abdelrahman, "Vision algorithms for hypercube machines," Journal of Parallel and Distributed Computing, vol. 4, no. 2, pp. 79-94, April 1987.
HYPER86a T.N. Mudge, G.D. Buzzard and T.S. Abdelrahman, "A high performance operating system for the NCUBE," Proc. of the 2nd Conference on Hypercube Multiprocessors, pp. 90-99, Knoxville, TN, October 1986.
HYPER86b W.R. Martin, T.C. Wan, T.N. Mudge and T.S. Abdelrahman, "Monte Carlo photon transport on the NCUBE," Proc. of the 2nd Conference on Hypercube Multiprocessors, pp. 454-463, Knoxville, TN, October 1986.
CAPP83 T.N. Mudge and T.S. Abdelrahman, "A case study of a program for the recognition of occluded parts," Proc. of the IEEE Workshop on Computer Architecture for Pattern Analysis and Image Database Management, pp. 50-60, Pasadena, CA, October 1983.
ICPP83 T.N. Mudge and T.S. Abdelrahman, "Efficiency of feature dependent algorithms for the parallel processing of images," Proc. of the Int'l Conference on Parallel Processing (ICPP), pp. 369-373, Blair, MI, August 1983.