Find what got GCed between two points in time (like the selection) with full memory allocation tracking enabled

Created June 19, 2023 23:18

Hi!

I know we can diff two snapshots and see what got created, what got destroyed, and what survived.

However, this pretty much only works for objects listed in the snapshots, not for those that got created and destroyed after the first snapshot and before the second.

Also, I know that we can see all the allocations made during a certain time range (with full memory allocation tracking enabled).

But is there a way to find out what got garbage collected during a time range?

Something like making a diff between a Snapshot and all memory allocations before that snapshot.

My scenario is that I'm trying to figure out, during a specific operation in my application, what got allocated and then destroyed. For now, with full memory tracking, I can see all the allocations made in total, but I cannot tell those that survived from those that got GCed. It feels like this should be possible by taking a snapshot before (X) and after (Y) the operation, and basically computing Allocations minus what is new in Y (if that makes sense), but I could not figure out how to do this.
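To make the "allocations minus survivors" idea concrete outside the profiler, here is a rough in-process sketch. It only works on byte totals, not per type, and it assumes .NET Core 3.0 or later (for `GC.GetTotalAllocatedBytes`). `AllocationProbe` and `Measure` are hypothetical names made up for this illustration, not part of dotMemory.

```csharp
using System;
using System.Collections.Generic;

static class AllocationProbe
{
    // Hypothetical helper: runs `operation` and returns three byte counts:
    // allocated during the call, still retained afterwards, and the difference.
    public static (long Allocated, long Retained, long Collected) Measure(Action operation)
    {
        // Force a full collection first so the baseline only counts live objects.
        long liveBefore = GC.GetTotalMemory(forceFullCollection: true);
        long allocatedBefore = GC.GetTotalAllocatedBytes(precise: true);

        operation();

        long allocatedAfter = GC.GetTotalAllocatedBytes(precise: true);
        // Collect again so that only survivors remain in the "after" figure.
        long liveAfter = GC.GetTotalMemory(forceFullCollection: true);

        long allocated = allocatedAfter - allocatedBefore;
        // Heap growth is used as an estimate of what the operation retained.
        long retained = Math.Max(0, liveAfter - liveBefore);
        return (allocated, retained, allocated - retained);
    }
}

static class Program
{
    static void Main()
    {
        var kept = new List<string>();
        var result = AllocationProbe.Measure(() =>
        {
            for (int i = 0; i < 100_000; i++)
            {
                string s = new string('x', 32); // always a fresh allocation
                if (i % 10 == 0) kept.Add(s);   // only one in ten survives
            }
        });
        Console.WriteLine(
            $"allocated={result.Allocated} retained={result.Retained} collected={result.Collected}");
        GC.KeepAlive(kept);
    }
}
```

The numbers are approximate: objects that were alive before the operation and died during it lower the "retained" figure, and allocations on other threads are counted too. It gives a quick sanity check per section of code, not the per-type breakdown asked about here.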

Thanks!

Francois

4 comments

Anna Guseva

Created June 20, 2023 15:56

Hello,

The Memory Allocations view shows the objects allocated in the selected timeframe. This view is designed to help solve some performance-related issues, e.g., to figure out why GC pressure is too high.

According to your description, you would like to get information about the objects which were allocated AND collected during the selected timeframe. Currently, dotMemory can't present this. You're welcome to vote for the corresponding feature request in our tracker:
https://youtrack.jetbrains.com/issue/DMRY-10389/Show-the-count-and-the-size-of-the-objects-allocated-and-collected-during-the-selected-timeframe

Could you please describe what issue you are experiencing in your application and how you will use information about collected objects to solve it? This would help us better understand your scenario.

Francois Cournoyer

Created June 20, 2023 18:35

In my case, the application is loading thousands of XML files, which creates about 5 million objects to represent the file content. That part is expected.

However, while performance profiling during the load, sometimes up to 60% of the time is spent on GC.

After memory profiling, we noticed that more than 100 million objects were allocated during loading. Profiling with full memory allocation tracking let us eliminate the low-hanging fruit: objects that, just from their signature, we know are temporary. For example, LINQ operations in methods called millions of times are not the best idea. We could see things like "List+Enumerator<MyObject>" coming from back traces similar to "Enumerable+<CastIterator>.MoveNext()". Those were easy and quick to fix. Just from the signature and call stack, we know that they got allocated and collected (and since we can't find a single instance remaining in a snapshot, they are not memory leaks).
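As an illustration of that kind of temporary traffic, the sketch below compares a `Cast<MyObject>()` loop with a plain `foreach` over the list. `MyObject` is a placeholder type and `EnumeratorTraffic` is a hypothetical helper invented for this example. The exact byte counts depend on the runtime version, so none are claimed here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

sealed class MyObject { public int Value; }

static class EnumeratorTraffic
{
    // Cast<T>() returns a heap-allocated iterator, which then enumerates the
    // list through IEnumerable and so typically boxes List<object>.Enumerator.
    public static int SumWithCast(List<object> items)
    {
        int total = 0;
        foreach (MyObject o in items.Cast<MyObject>()) total += o.Value;
        return total;
    }

    // foreach directly over List<T> uses its struct enumerator: no heap allocation.
    public static int SumWithLoop(List<object> items)
    {
        int total = 0;
        foreach (object o in items) total += ((MyObject)o).Value;
        return total;
    }

    // Bytes allocated on the current thread while `action` runs.
    public static long AllocatedBy(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}

static class Program
{
    static void Main()
    {
        var items = new List<object> { new MyObject { Value = 1 } };
        const int calls = 100_000; // stands in for "methods called millions of times"

        // Warm up so one-time JIT and static initialization costs are not measured.
        EnumeratorTraffic.SumWithCast(items);
        EnumeratorTraffic.SumWithLoop(items);

        long castBytes = EnumeratorTraffic.AllocatedBy(() =>
        {
            for (int i = 0; i < calls; i++) EnumeratorTraffic.SumWithCast(items);
        });
        long loopBytes = EnumeratorTraffic.AllocatedBy(() =>
        {
            for (int i = 0; i < calls; i++) EnumeratorTraffic.SumWithLoop(items);
        });

        Console.WriteLine($"Cast<T> loop: {castBytes} bytes, plain foreach: {loopBytes} bytes");
    }
}
```

Every call to the LINQ version creates short-lived objects that nothing references afterwards, which is why no instance of them ever shows up in a snapshot.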

Now, we are trying to determine which objects are allocated and collected during this load time, and which ones persist. For example, since the data comes from XML files, there are a lot of string allocations. Which ones are still being used, and which ones were temporary? And how does that differ between sections of the loading code? This applies to a lot of other objects, where we're not too sure whether they survived or got collected. This would hopefully give me more hints, and also show whether some objects got promoted up to Gen2 or got collected earlier.

In the example of the LINQ operation, the signature alone tells us that those objects were temporary. But sometimes the signature is not obvious (like an array or a list). If we knew which objects got collected during a specific time frame, we could ask ourselves "Should we even be allocating those objects in the first place?" Lists that got allocated and are still being used after the loading (and that are not memory leaks) are a good thing, but maybe I'm not expecting lists to be collected during a section of code, and seeing them there would hint that those collected lists shouldn't be allocated in the first place.

In the end, the memory traffic is a big issue as we have a high number of objects. We need to allocate less. If we need to allocate, we need to concentrate on either keeping them long term (Gen2), or if they are temporary, we want them to stay in Gen0.
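A coarse way to check that goal from inside the application is to count collections per generation around a section of code and to ask the runtime which generation a surviving object currently sits in. This is only a sketch using the standard `GC.CollectionCount` and `GC.GetGeneration` calls. `GenerationProbe` is a hypothetical name, and the printed values depend on the runtime and GC settings.

```csharp
using System;

static class GenerationProbe
{
    // Returns how many collections of each generation happened while
    // `operation` ran. Note that a Gen1 or Gen2 collection also bumps the
    // counters of the younger generations.
    public static int[] CollectionsDuring(Action operation)
    {
        int[] before = new int[GC.MaxGeneration + 1];
        for (int g = 0; g < before.Length; g++) before[g] = GC.CollectionCount(g);

        operation();

        int[] delta = new int[before.Length];
        for (int g = 0; g < delta.Length; g++) delta[g] = GC.CollectionCount(g) - before[g];
        return delta;
    }
}

static class Program
{
    static void Main()
    {
        byte[] survivor = null;

        int[] counts = GenerationProbe.CollectionsDuring(() =>
        {
            survivor = new byte[1024];          // meant to be kept long term
            for (int i = 0; i < 200_000; i++)
            {
                var temp = new byte[256];       // temporary, dies right away
                temp[0] = 1;
            }
        });

        for (int g = 0; g < counts.Length; g++)
            Console.WriteLine($"Gen{g} collections during the operation: {counts[g]}");

        // Where did the long-lived object end up?
        Console.WriteLine($"survivor is in Gen{GC.GetGeneration(survivor)}");
        GC.KeepAlive(survivor);
    }
}
```

If the Gen1 and Gen2 counters climb noticeably during a section that should only produce temporaries, some of those temporaries are probably living long enough to be promoted.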

Anna Guseva

Created June 21, 2023 16:05

Thank you for the information.

Actually, you can try this feature in the old dotMemory version (2021.1): Other Versions - dotMemory (jetbrains.com)

Analyze Memory Traffic | dotMemory (jetbrains.com)

We have replaced the old concept of memory traffic with the new one, but the implementation of some of the old features has been delayed.

Note that the "old" memory traffic view is only available between snapshots or for one snapshot since the start of the profiling session.

Francois Cournoyer

Created June 23, 2023 17:17

Oh thanks a lot for this!

This old feature is pretty much what I was searching for. It's been really interesting to sort by the "collected objects" column! This feature needs to come back in the newer versions!
