I have gotten conflicting information, so I thought I would clarify. Does adding a limit to a fetch in LBs improve performance?
In my use case I know there will only be 2 lines for a given fetch (core charge/core return), will adding a limit of 2 on the fetch improve performance or add any benefit?
The reason for conflicting information is likely because the answer is quite nuanced, with most nuance being based on the shape of the dataset.
Limits say "stop/return this query as soon as I find N records." Generally, the performance of LIMITs is based on whether the query returned because of the limit or because it ran out of records to process.
That is to say: if we assume that your limit is less than the number of records on the table, and that it is less than or equal to the number of records you will get back from the fetch, then it is faster to use a limit in the fetch. The query stops due to hitting a limit.
If we assume the limit is greater than or equal to the number of records on the table, OR that it is greater than the number of records you will get back from the fetch, it is slower to use a limit in the fetch. The query stops due to running through all available records.
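The two cases above can be sketched with a quick script. This is a minimal sketch using Python's `sqlite3` as a stand-in for Postgres (the LIMIT semantics are the same for this purpose); the `so_detail` table, its columns, and the data are all made up for illustration.

```python
import sqlite3

# In-memory stand-in database; schema and data are invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE so_detail (id INTEGER PRIMARY KEY, header_id INTEGER, line_type TEXT)")
# 1000 detail rows spread across 500 headers, so header 42 owns exactly 2 rows.
conn.executemany(
    "INSERT INTO so_detail VALUES (?, ?, ?)",
    [(i, i % 500, "misc") for i in range(1000)])

# Case 1: the limit is less than the table size and <= the matching rows,
# so the scan can stop as soon as both rows for header 42 are found.
with_limit = conn.execute(
    "SELECT id FROM so_detail WHERE header_id = 42 LIMIT 2").fetchall()

# Case 2: the limit exceeds the matching rows, so it never fires and the
# query stops only after running through all available records.
never_fires = conn.execute(
    "SELECT id FROM so_detail LIMIT 2000").fetchall()

print(len(with_limit), len(never_fires))  # 2 1000
```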
In your case, do you find 2 records because only 2 exist? Or do you find 2 records via a filter? If the former, using a LIMIT will not be faster. If the latter, using a LIMIT will be faster.
Remember that filters are applied behind the scenes for details and subtables. If you have 2 records in a detail table, and you fetch those, then assuming those aren't the only 2 detail records that exist across all headers, it will be faster to use a limit, as it will return as soon as it finds the records you are looking for rather than scanning all detail records.
@samuel.sovereign I too have gotten conflicting information in the past. I generally subscribe to what you've listed above. But I've heard something related to the Limit adding unnecessary query parameters being detrimental. Is that a myth?
Nothing specifically comes to mind for "adding unnecessary query parameters." I will say I neglected mentioning OFFSET, which adds to the general rules above (use LIMIT + OFFSET as your effective limit value rather than just LIMIT, since the server still has to walk past the offset rows before returning anything).
It is an extra parameter to bind (edit: add) in the prepared statement, but this should not be detrimental at all to the performance of the query. EDIT: There is no binding, which makes this even less of a concern
I am inclined to say that statement, "the Limit adding unnecessary query parameters being detrimental", is a myth. But if you see, hear, or experience differently, please let us know and we can look at the specific case. As I mentioned, query performance is quite nuanced and there are a lot of factors that can affect it.
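To make the OFFSET point concrete: even though only LIMIT rows come back, the server still has to walk past the skipped rows, so the "effective limit" for the rules above includes the offset. A minimal sketch, again using `sqlite3` as a stand-in with an invented table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

# Page 2 with a page size of 2: only 2 rows are returned, but the engine
# had to produce and discard the 2 offset rows first, so it effectively
# processed 2 + 2 = 4 rows before stopping.
page = conn.execute(
    "SELECT id FROM t ORDER BY id LIMIT 2 OFFSET 2").fetchall()
print(page)  # [(2,), (3,)]
```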
I agree with @Aaron.C . We were first told to use them all the time if we only expect a certain number of records to be retrieved. Then we were told not to use them as it does not help with performance since it is another parameter. We do not have a specific use case, but we have adopted NOT using them as a rule of thumb. Perhaps @paul.villanueva can speak more to it since he has been more involved in performance enhancements.
If you know your fetch is only going to return 1 record, adding LIMIT 1 does not improve performance. For instance, if you are fetching by nwid there is not much difference whether you add LIMIT 1 or not.
That's not strictly true, partially due to what @samuel.sovereign was hinting at above - adding the limit tells the server to stop looking after it's found n records.
The most common query by nwId with a limit of 1 says to the server "look for a record in this table with this nwId. Stop when you've found 1 record."
This is more performant than not including a limit because that statement says "look for all records in this table with this nwId." Since the server does not implicitly know that nwId is unique, it will keep looking for more records even after finding the single match. The performance improvement is likely insignificant, but it is there and will become more apparent at scale.
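The key detail here is that the index is not declared unique, so the engine cannot assume one match. A minimal sketch, with `sqlite3` standing in for Postgres and a made-up `records` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# nwid is indexed but NOT declared UNIQUE, so the engine has no way to
# know there is only one matching row.
conn.execute("CREATE TABLE records (nwid TEXT, payload TEXT)")
conn.execute("CREATE INDEX idx_nwid ON records (nwid)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(f"id-{i}", "x") for i in range(1000)])

# Without a limit, the engine keeps probing for further 'id-7' rows after
# the first hit; LIMIT 1 lets it return as soon as one row is produced.
row = conn.execute(
    "SELECT payload FROM records WHERE nwid = ? LIMIT 1", ("id-7",)).fetchone()
print(row)  # ('x',)
```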
There are lots of nuances to these cases and the answer to these questions is frequently situational. Typically, if you expect the query to return 1 record, you should always include a limit for two reasons:
Performance (sometimes this is an improvement, sometimes it isnât)
Readability of code (at a glance, "Limit 1" tells you what the developer was expecting when they wrote that code and makes understanding the code or debugging simpler)
@lauren Yeah I can understand the frustration, which is why I avoided a straight yes/no in my answer. There's just so much it can depend on and I'd be cautious to say "always/never use a limit."
The only time using a LIMIT is equally performant to not using one, in the LIMIT 1 case, is when you exactly match the Unique Index with your filters, AND Postgres decides to use the Unique Index, which is not a given (it uses whatever plan it decides is best at runtime based on data distribution). In this case, Postgres knows it can stop and return the first record it finds.
I would generally agree with Ross's advice to use a limit in this case, for the same reasons he listed, and that it fits into Use Case 1 I listed in my original answer: The limit is LESS THAN the number of records in the table AND less than or EQUAL to the number of records found from the fetch.
Again this is nuanced. For example, if you add a sort order/ORDER BY to the query, it is possible that a LIMIT degrades performance on an otherwise well-performing query, since the server might now have to scan and sort the whole matching set before it can return the first N rows in the correct order.
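For example, a top-1 query ordered by an unindexed column cannot stop early even with a limit, because the engine only knows which row is "first" after considering every candidate. A sketch with `sqlite3` and an invented `orders` table (this demonstrates the query shape, not the timing):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, f"2024-01-{i + 1:02d}") for i in range(28)])

# created_at has no index, so even with LIMIT 1 the engine must examine
# every row (a top-N sort) before it knows which one is the latest.
latest = conn.execute(
    "SELECT id FROM orders ORDER BY created_at DESC LIMIT 1").fetchone()
print(latest)  # (27,)
```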
I asked ChatGPT about this - ChatGPT - When to use LIMIT. See in particular the follow up question on how LIMIT interacts with indexes.
Aligns with most of the above comments. I personally would not be recommending LIMITs when asked about this situation. The marginal performance benefit it provides, to me, is not worth the risk of the user defining a LIMIT 2 because they believe there will only ever be 2, then surprise surprise, somehow there's actually 3 and they can't figure out what's going wrong.
With respect to the original question, I am in agreement with @ian.p . Unless you know for certain that there will be exactly 2 records, using specific fetch parameters but not a limit is probably the way to go.
Limits do also tend to hide the fact that extraneous / incorrect data exists, because you don't unintentionally get back more records than expected.
In my use case I do only have 2 records, so I guess no limit would be faster. Thank you for your detailed response!
Update: I wrote this before reading all of the other comments and now I am confused haha. So I guess I DO want to use a limit if I know for sure there will be 2 or fewer records? That is what I am gathering. @samuel.sovereign
That answer depends on context, so it isn't a "one answer fits all."
If your table only has 2 records in it and you are doing a blanket fetch → no limit
If your table has many records but your fetch expressions will narrow the data set down to exactly 2 records → no limit required, but it can be good to implement one for readability
If your table has many records and your fetch might narrow down the data set but not to only 2 records → sort + limit
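The three cases above can be summarized as a tiny decision helper. This is just my paraphrase of the rule of thumb; the function and parameter names are made up, not part of any real API:

```python
def limit_advice(blanket_fetch: bool, filter_is_exact: bool) -> str:
    """Paraphrase of the three cases above (names are invented)."""
    if blanket_fetch:
        # The table itself is the result set; a limit never fires.
        return "no limit"
    if filter_is_exact:
        # The filter already yields exactly the rows you want.
        return "limit optional (readability)"
    # The filter may over-match, so order the rows and cap the result.
    return "sort + limit"

print(limit_advice(blanket_fetch=True, filter_is_exact=False))   # no limit
print(limit_advice(blanket_fetch=False, filter_is_exact=True))   # limit optional (readability)
print(limit_advice(blanket_fetch=False, filter_is_exact=False))  # sort + limit
```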
If you provide some more context about the table you're fetching from, whether you have fetch expressions, how many records should return (or could return), and whether you have a sort, we can better answer whether your fetch should or should not include a limit (or if it doesn't really matter).
Okay, I am thinking this is a "really doesn't matter" scenario, but to confirm: basically we will have a sales order record with 1-2 SO details fetched through the nwHeaderID/parentPartSequenceNumber on each detail. @ross
I would tend to say "No Limit" if you're really looking for all of the associated details by that part number - if those systematically can only be 2 in number, the nwHeaderId and other filter criteria should guarantee you get both records back.