Find Page Breaks Faster

  • I need to find row numbers of the horizontal page breaks on a sheet quickly. The only method I have found to do this is by using the HPageBreaks property, which is painfully slow.


    Is there another way of doing this that's faster? Or some other manner of using HPageBreaks that improves its speed?


    The scenario is that I have a macro which generates a list of items where each item is two rows long. If an item intersects a page break--one row is on one page and the 2nd row is on the next page--I want to insert a row so that the entire item is on the next page.


    I've been searching everywhere and can't find any other way. Any help would be greatly appreciated.

  • Re: Find Page Breaks Faster?


    You really need to post the code you are using. I suspect switching of the calc and the screen would speed things up.


    You could try jumping to the page breaks like :-



    HTH
    Carl

  • Re: Find Page Breaks Faster?


    That's similar to the code I had. But I tried your code in a fresh workbook, adding the code for turning off screen updating and calculation (and tweaked the row offset for insertting a new row).



    It took about 12 seconds per 100 rows and each cell in the first column just had two words. That's painfully slow in the application I'm using it in.


    If I have to, I'll use it, but I'm hoping there's another way.....


    Thanks for replying, though! :)

  • Re: Find Page Breaks Faster?


    Excel needs to chat with the printer driver when you're doing page setup, and I expect this is also true in finding pagebreaks. If the printer is not online, it takes much longer.

    [SIZE=1]Entia non sunt multiplicanda sine necessitate.[/SIZE]

  • Re: Find Page Breaks Faster?


    It does take time because as shg points out as you run through the pagebreaks collection it starts to talk to the spooler. Your case seems to be too slow though. I tested with 20000 rows and it took 35 seconds (printer off or on)[XP, Office 2003, Core 2 2.13 @ 50%). I have seen several issues with printer drivers slowing Excel to a crawl so maybe worth searching on that or reinstalling.


    As you are waiting have a look in the task manager and see if the spooler (spoolsv.exe) is eating resources.

  • Re: Find Page Breaks Faster?


    You can use an XLM4 macro function to return an array of the rows immediately below where the horizontal page breaks are. Unfortunately, despite repeated efforts, I couln't get an array returned in code via Application.ExecuteExcel4Macro() macro - it only seemed to want to return a single value representing the first page break. It did however work for me as a defined name and referencing that in worksheet cells:


    1. Go Insert>Name>Define and call your name something like HPBreaks
    2. In refers to type:


    =GET.DOCUMENT(64)


    3. Click Add
    4. Back in the worksheet select some empty column somewhere (lets say Z1) and type in the formula:


    =INDEX(HPBreaks,ROW())


    and copy down as far as required.


    This will (assuming you have Automatic Calculation enabled) show you on what row the breaks are and will dynamically update. You can refer to this list from code as required ie to incorporate into your existing macro for example.


    Hope this helps!


    Richard

  • Re: Find Page Breaks Faster?


    "XLM4 macro function" ! Nice call Richard. You have obviously been using Excel for a while.


    Unfortunately when I tried it out on the 20K rows it took 20 secs to calculate with spoolsv.exe still active. Inserting rows takes time too. Then I tried adding the ranges together with a Union statement before inserting and it appears to have improved things (Union did not make much difference with the collection method). Now about 25secs.

  • Re: Find Page Breaks Faster?


    That's pretty cool, Richard, thanks.

    [SIZE=1]Entia non sunt multiplicanda sine necessitate.[/SIZE]

  • Re: Find Page Breaks Faster?


    Quote


    I wonder if it would be faster compared to accessing the VBA PageBreaks collection if a network printer was involved?


    You mean maybe the XLM4 function talks to the spooler but maybe not all the way to the printer so if its a network printer the difference maybe more pronounced ? Possible yep.



    Quote


    which is pretty slick I think.


    Me too. That evaluate function is great. There is a post (Aaron's I think) somewhere that lists all sorts that you can do with it.


    It looks like what ever method we use Excel is going to have a very inefficient conversation with the spooler program. Digging around in the detail of what exactly they are talking about together will take along time. If this was my worksheet I think I would side step the issue and take full control of the printing, page by page, with VBA.

  • Re: Find Page Breaks Faster?


    Another way would be to take control and set your own page breaks: use range.height to estimate where the page breaks will occur, and insert a manual break prior at a preferred boundary.

    [SIZE=1]Entia non sunt multiplicanda sine necessitate.[/SIZE]

  • Re: Find Page Breaks Faster?


    Thanks for all of your replies! Looks like I have some options to explore. Incidentally, it is a network printer that the computer is hooked up to, so that may be contributing to the delay.


    I'll dig around and post my results.


    Thanks again.

  • Re: Find Page Breaks Faster?


    Try this.

    [SIZE=1]Entia non sunt multiplicanda sine necessitate.[/SIZE]

  • Re: Find Page Breaks Faster


    Set page breaks that break such that groups of lines stay together -- analogous to 'keep with next' in Word.

    [SIZE=1]Entia non sunt multiplicanda sine necessitate.[/SIZE]

  • Re: Find Page Breaks Faster


    Thanks for the code, shg!


    I created an acceptable solution using parsnip's code, but I'm trying your approach to see if it's faster. The only problem I've come across is the Range("Print_Area").


    For some reason, that range is not created when the PageBreaks are reset. Would having ScreenUpdating turned off or Calculation set to Manual have any affect on that?


    Thanks again!

  • Re: Find Page Breaks Faster


    Print_Area is the name of the range that gets created when you select the area you want printed and do File > Print Area > Set Print Area.

    [SIZE=1]Entia non sunt multiplicanda sine necessitate.[/SIZE]

  • Re: Find Page Breaks Faster


    Thanks, shg, but I know what "Print_Area" is. I thought maybe it was automatically created when the PageBreaks were reset. I guess that'll have to be created in the procedure.


    Thanks again! :)

  • Re: Find Page Breaks Faster


    Quote

    For some reason, that range is not created when the PageBreaks are reset.


    No, it's not -- nor is it affected by page breaks. You can set it manually once, or define it as a dynamic range (my choice) if the data shrinks and expands, or set it in code if you prefer. All roads lead to Rome.

    [SIZE=1]Entia non sunt multiplicanda sine necessitate.[/SIZE]

  • I realize this is an ancient thread, but I have recently run into this issue and couldn't find a reliable work-around on the net. Plus, it's still one of the top results in Google, so I assume the thread must still be quite relevant.


    Anyway, I believe I have found a solution. It requires a few steps but once set up, works a charm - I've tested it against 8500 rows producing a 205 pages-long document in Excel versions 1806 and 1906 and I couldn't tell if my network printer took any longer printing it with than without. In contrast, using any of the solutions I could find (including using GET.DOCUMENT(64) alone) resulted in extremely long calculation times.


    The secret lies in taking RichardSchollar's solution and tweaking it so that GET.DOCUMENT(64) only gets evaluated once. So:



    1. Create a named range scoped to the sheet you want to print (it doesn't have to be, but you will need to change the object in step 4 if you choose to scope to the workbook) called vbValue_PageBreaksDisplayed. "Refers to" is irrelevant at this point as it will be populated from VBA.


    2. Create a named range - preferably scoped to the sheet, too - called Value_PageBreakRows and in the "Refers to" field paste:

    Code
    1. =IF(vbValue_PageBreaksDisplayed,GET.DOCUMENT(64))



    3. In VBA's public (default) module create this function:

    Code
    1. Public Function JoinArray(ByVal SourceArray As Variant, Optional ByVal Delimiter As Variant) As String
    2. JoinArray = Join(SourceArray, Delimiter)
    3. End Function



    4. And also this sub:



    5. (Optional: to ensure that any existing page break markers are wiped when activating a worksheet) In VBA under the header of the sheet you want to print (for example "Sheet1 (MySheetName)") paste this (or add SetPageBreaksDisplayed to an existing Worksheet_Activate event):

    Code
    1. Private Sub Worksheet_Activate()
    2. SetPageBreaksDisplayed
    3. End Sub



    6. (Optional: to ensure that any existing page break markers are wiped from the sheet the workbook opens on) In VBA under ThisWorkbook paste this:

    Code
    1. Private Sub Workbook_Open()
    2. SetPageBreaksDisplayed
    3. End Sub



    7. And, also under ThisWorkbook, this:

    Code
    1. Private Sub Workbook_BeforePrint(Cancel As Boolean)
    2. SetPageBreaksDisplayed False
    3. SetPageBreaksDisplayed
    4. End Sub



    8. Now we have all the required elements in place to make the magic happen. In a cell somewhere on your worksheet (this is very important, the result can't be stored in a named range - otherwise everything will slow down to a crawl) write:

    Code
    1. =IF(vbValue_PageBreaksDisplayed,JoinArray(Value_PageBreakRows,","))



    9. Name the cell you've put the above formula in Value_PageBreakRowsString.


    10. And finally, fill a column against the rows where you need to determine page breaks with this formula (it will return TRUE for rows below a page break and FALSE for everything else):

    Code
    1. =IF(vbValue_PageBreaksDisplayed,NOT(ISERROR(FIND(","&ROW()&",",","&Value_PageBreakRowsString&","))))




    If you're wondering about why I've used the SetPageBreaksDisplayed function to populate the vbValue_PageBreaksDisplayed named range and then used that as argument for all the IFs, one reason for that is to ensure that page breaks are established only when they're relevant (i.e. when you go to print the document). More importantly however, it's the main factor in coercing GET.DOCUMENT(64) to re-evaluate - that's why SetPageBreaksDisplayed isn't a sub but a function able to write a given value to the named range (it could've been broken down into two separate functions or even it's own class) and why it needs to get called twice in the Workbook_BeforePrint event. This is only relevant when you've printed the sheet once already and could print it again with page breaks at different rows (this could be affected by using different filters/slicers, different printing orientation or paper size, etc.). If the value of vbValue_PageBreaksDisplayed doesn't change between printings, GET.DOCUMENT(64) will never get re-evaluated - if it was TRUE because you printed the sheet already, it needs to become FALSE before going TRUE again to force re-evaluation.



    I'm not sure if all these steps are necessary - things might work differently depending on your Excel version or you simply might not need all of this for your purposes - but I hope this guide at least helps in pointing you in the right direction.

  • You really need to post the code you are using. I suspect switching of the calc and the screen would speed things up.


    Calculations or screen updating aren't a problem here, it's the communication between Excel and the printer that's required.
    The code that I was comparing to was RichardSchollar's solution from earlier in this thread, as it looked like the most promising work-around that I could find, but it really doesn't matter (all implementations that I could find suffer from the same problem), which is why I didn't bother posting it again:



    I tested against the above with the only difference of having to replace INDEX(HPBreaks,ROW()) with MATCH(ROW(),HPBreaks,0) to find page breaks (INDEX won't work, unless I misunderstood Richard's intent).


    Just like any other proposal that Google spat out, it still requires Excel to communicate with the printer for every row that you want to test for having a page break (even though it seems like it shouldn't). Switching calculations and screen updating off doesn't do anything. You still need to calculate your page breaks at some point at least once. In a document with 5000 rows that means doing it 5000 times, so doing it even once is going to be one time too many. Never mind the "where" and "how" of it.


    My solution, while not without its shortcomings (for example, relying on Workbook_BeforePrint event means that page breaks won't get established when exporting to .pdf) ensures Excel does it only once, regardless of how many rows your spreadsheet has.