Nemanja G.

Senior Software Engineer

Nis, Serbia

About Me

Nemanja is a software engineer with over 11 years of industry experience in C++, CUDA, computer vision, machine learning, performance optimization, and more. He is passionate about programming, both professionally and in his own time, and strives to write high-quality, high-performance code.

C, Computer Vision, Visual Studio, C++17, NVIDIA CUDA, C++11, Machine Learning, Algorithms, Deep Learning, Artificial Intelligence (AI), Git, Windows, OpenGL, Multithreading

Work history

MotionDSP
R&D Lead Engineer
2013 - 2019 (6 years)
Remote
  • Wrote an automatic optimizer for data-intensive algorithms in C++. The code automatically generates multi-core and vector optimizations.

  • Worked closely with other researchers on design, implementation, and optimization of computer vision, image processing, video processing, machine learning, AI, and deep learning algorithms.

  • Envisioned an algorithm for creating mosaic images from surveillance footage or sets of aerial images. Guided the team to the successful implementation of the algorithm.

  • Successfully modernized C++ codebase by porting it to C++11, C++14, and C++17. Made code more secure and less prone to errors and memory leaks.

  • Established coding style guidelines and introduced good programming practices to the team, including pair programming and code reviews, to support teamwork.

  • Led development of a 3D GIS project, similar to Google Earth, with real-time video stream rendering on top of the 3D globe.

  • Managed the R&D division of the company. Monitored all major R&D projects, reported on progress, and provided technical guidance to keep on track.

MotionDSP
R&D Engineer - Senior Software Engineer
2008 - 2013 (5 years)
Remote
  • Improved the existing multi-frame super-resolution algorithm by making it resilient to the ghosting effects it suffered from at the time.

  • Ported most of the company's video processing algorithms to CUDA (GPGPU), including super-resolution, de-blurring, contrast enhancement, frame-rate adjustment, and more.

  • Optimized all the above-mentioned algorithms and enabled real-time performance.

  • Created a testing framework for video processing algorithms to ensure successful regression testing under change.

  • Ported an extremely challenging MSER feature detector to CUDA (filed a patent).

  • Created an RAII-based GPU memory management system which hides memory allocation latency and enables even more performance.

Windows, OpenCL/GPU, Git, Caching, Parallel Programming, Multithreading, Performance Optimization, Visual Studio, Profiling, Memory Management, Unit Testing, Test-driven Development (TDD), Low Latency, Real-time Systems, Image Processing, Video Processing, Computer Vision, GPGPU, OpenCL, NVIDIA CUDA, C++
Deutsche Telekom Laboratories
Junior Researcher
2007 - 2008 (1 year)
Remote
  • Researched the problem of real-time human head pose estimation in the field of computer vision (CV).

  • Implemented a novel approach to this problem. Used OpenCV, C++, and Linux.

  • Published a paper on the proposed method at the Automatic Face and Gesture Recognition conference.

Faculty of Electronic Engineering, University of Nis
Software Development Intern
Present
Remote
  • Developed a 3D graphics engine for massive landscape rendering in C++ and OpenGL.

  • The engine was able to render gigabytes of terrain texture data in real time by automatically adjusting the level of detail on a per-frame basis.

  • Optimized engine performance to achieve real-time rendering.

Low Latency, Real-time Systems, Performance Optimization, Visual Studio, 3D Graphics Engines, 3D Graphics, OpenGL, C++

Portfolio

Poker Playing Bot

An autonomous poker-playing program. The program won against human players and used Bayesian inference to estimate the opponents' style of play after just a few hands. It won first place at the Annual Computer Poker Competition (ACPC) 2018 in the Six-Player No-Limit Texas Hold'em category and second place at ACPC 2017 in the Heads-up No-Limit Texas Hold'em category.

Source in C++

A template-based function specification and derivation engine in C++. It is similar to Python's Theano, but more basic. Users can specify functions in explicit form:
  • Symbol x(0);
  • Symbol y(1);
  • auto f = (sqrt(sqr(x) * 2.0f + sqr(y)) + 1.0f);
They can also evaluate the functions, get their derivatives, and apply them to a data collection, possibly in parallel.
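For illustration only, here is a minimal sketch of how an engine with this interface could be built using expression templates. It is not the project's actual source: apart from `Symbol`, `sqr`, and `sqrt`, which appear in the snippet above, every name here (the `sketch` namespace, `Expr`, `Const`, `Add`, `Mul`, `Sqrt`, `eval`, `diff`) is an assumption. It covers only specification, evaluation, and first derivatives at a point, and leaves out applying a function to a data collection.

```cpp
#include <cmath>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not from the original project

struct Expr {};  // tag base so the operators below only match engine types

template <class T>
constexpr bool is_expr = std::is_base_of_v<Expr, T>;

// A variable, identified by its position in the input array.
struct Symbol : Expr {
    int index;
    explicit Symbol(int i) : index(i) {}
    float eval(const float* v) const { return v[index]; }
    float diff(const float*, int wrt) const { return wrt == index ? 1.0f : 0.0f; }
};

// A numeric constant; its derivative is always zero.
struct Const : Expr {
    float value;
    explicit Const(float c) : value(c) {}
    float eval(const float*) const { return value; }
    float diff(const float*, int) const { return 0.0f; }
};

template <class L, class R>
struct Add : Expr {
    L l; R r;
    Add(L lhs, R rhs) : l(lhs), r(rhs) {}
    float eval(const float* v) const { return l.eval(v) + r.eval(v); }
    // Sum rule.
    float diff(const float* v, int w) const { return l.diff(v, w) + r.diff(v, w); }
};

template <class L, class R>
struct Mul : Expr {
    L l; R r;
    Mul(L lhs, R rhs) : l(lhs), r(rhs) {}
    float eval(const float* v) const { return l.eval(v) * r.eval(v); }
    // Product rule.
    float diff(const float* v, int w) const {
        return l.diff(v, w) * r.eval(v) + l.eval(v) * r.diff(v, w);
    }
};

template <class A>
struct Sqrt : Expr {
    A a;
    explicit Sqrt(A arg) : a(arg) {}
    float eval(const float* v) const { return std::sqrt(a.eval(v)); }
    // Chain rule: d/dw sqrt(a) = a' / (2 * sqrt(a)).
    float diff(const float* v, int w) const {
        return a.diff(v, w) / (2.0f * std::sqrt(a.eval(v)));
    }
};

// Operators build expression types at compile time; nothing is evaluated here.
template <class L, class R, class = std::enable_if_t<is_expr<L> && is_expr<R>>>
Add<L, R> operator+(L l, R r) { return Add<L, R>(l, r); }

template <class L, class = std::enable_if_t<is_expr<L>>>
Add<L, Const> operator+(L l, float c) { return Add<L, Const>(l, Const(c)); }

template <class L, class R, class = std::enable_if_t<is_expr<L> && is_expr<R>>>
Mul<L, R> operator*(L l, R r) { return Mul<L, R>(l, r); }

template <class L, class = std::enable_if_t<is_expr<L>>>
Mul<L, Const> operator*(L l, float c) { return Mul<L, Const>(l, Const(c)); }

template <class A, class = std::enable_if_t<is_expr<A>>>
Mul<A, A> sqr(A a) { return Mul<A, A>(a, a); }

template <class A, class = std::enable_if_t<is_expr<A>>>
Sqrt<A> sqrt(A a) { return Sqrt<A>(a); }

}  // namespace sketch
```

Under these assumptions, the expression `f` from the snippet above compiles unchanged once `Symbol` is taken from `sketch`. With inputs `const float v[2] = {2.0f, 1.0f};`, `f.eval(v)` gives 4, and `f.diff(v, 0)` and `f.diff(v, 1)` give roughly 1.333 and 0.333. Because each expression has its own static type, the compiler can inline the whole evaluation.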

Education

M.Sc. Degree in Computer Science
Nis University
2000 - 2006 (6 years)