Puzzling minor issues with Nextcloud VM installations

silversolver · May 8, 2021, 12:02am

I have been using Nextcloud for myself since last September, a little over 6 months now, and I love it. I just deployed it for a client of mine as well. In both cases I used the VM resources, although in the client’s case I used the VM script from HERE to deploy it on a fresh install of Ubuntu Server 20.04.2 running on an old machine that would otherwise have been retired. It was one of those things that no one said would work but that I figured would, and it did. However, I wouldn’t recommend installing on a physical server this way unless you plan to dedicate the device to NC; we did, so it was perfect. For my own instance, I am using the prebuilt VM on a Windows host. I started with the NC19 VM, then recently migrated my data to a fresh NC21 VM, which was a nice upgrade. (I dumped the data out of the old one to a Windows share via CIFS, then imported it into the new one the same way.)

That is all background information. The issues I have are only annoying and do not interfere with usability, but I would like to understand why they are happening and resolve them. I believe two of them are connected.

1. Connecting insecurely gives the landing page with “Thank you for downloading the Nextcloud VM, you made a good choice! If you see this page, you have run the first setup, and you are now ready to start using Nextcloud on your new server. Congratulations! :)” I would have thought that this would only appear once and then it would simply forward to the secure login page.
2. Running “sudo -i” to gain root, sometimes because I really need it and sometimes because, yes, I’m too lazy to type sudo repeatedly, triggers the first run script every time I do it. Of course it quits quickly because it realizes it has already been run, but it would be really nice to have “sudo -i” just give me root rather than griefing me with the script every time.
3. On the most recent deployment, on the physical machine, I attempted to run the automatic backup wizard from menu.sh but it gave me the same error it did when I tried to run it from the setup script, that it can’t be run during the initial setup.

Clearly some things are not getting properly cleared away after the initial setup, and I’d like to know what they are and how to clear them. I could probably dig around for many hours and figure out what they are, but if someone knows and could spare 5 minutes to tell me, I’d be very grateful.

T&M Hansson IT AB are rockstars. The amount of energy they have put into this product and made available to the public free of charge is astonishing. “Pro bono” doesn’t mean “for free,” though that’s how it’s generally used; it means “for good,” or as a contribution to the greater good. These guys have done so much pro bono work that it is a little shocking. Thank you so much!

szaimen · May 8, 2021, 8:22am

Thank you very much for your great feedback!
Pinging @enoch85, too.

Concerning your support points:

1. Should work after you run the Activate TLS script: sudo bash /var/scripts/menu.sh → choose Server Configuration → Activate TLS
   Alternatively, you can use the deSEC script: sudo bash /var/scripts/menu.sh → choose Server Configuration → deSEC → choose to Activate TLS at the end of the script
2. It seems like you didn’t run the startup script to the end. You can fix this by running:
   sudo rm /var/scripts/nextcloud-startup-script.sh
3. Will be fixed with this command, too.

enoch85 · May 8, 2021, 10:45am

@szaimen is right here, the startup script finished non-clean.

Removing the script doesn’t fix the main issue, but it works around the issue you’re facing.

An even better solution would be to rerun the whole setup again and make sure to follow all the prompts carefully. The last thing that happens is that everything is reset to “normal” mode and the startup script is removed. If that doesn’t happen in a clean way you end up in a kind of a broken state. It works, but I wouldn’t feel comfortable without knowing exactly what went wrong.

silversolver · May 8, 2021, 1:52pm

Thank you for taking the time to help here. However, your information makes this that much more puzzling. I’m guessing you didn’t read my admittedly very long original post in its entirety, as I have experienced this behavior on 3 different installations, 2 of which are still in service. In all of these, I very carefully ran the startup script all the way to the end. Whatever is failing is failing consistently for me, and redeploying both instances currently in service where I have experienced this is not on the radar right now. As I did run the script all the way to the end and everything is working, I do not believe that anything is actually broken beyond the script having failed to delete itself and the trigger that sometimes launches it.

If it helps to clarify what is happening, normal login as ncadmin does not rerun the startup script; only login as root via sudo -i.
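Since only the root login is affected, one way to narrow this down is to look for whatever references the startup script in root's shell start-up files. This is a minimal sketch, assuming the trigger is a line in one of the usual bash start-up files; the function name `find_startup_hooks` and the list of files checked are my own assumptions, not something taken from the VM scripts.

```shell
# find_startup_hooks HOME_DIR [PATTERN]
# Prints every shell start-up file under HOME_DIR that mentions PATTERN.
# Assumption: the trigger lives in one of the files listed below; the file
# the VM actually uses is not confirmed in this thread.
find_startup_hooks() {
    fsh_home="$1"
    fsh_pattern="${2:-nextcloud-startup-script.sh}"
    for fsh_file in .bash_profile .bash_login .profile .bashrc; do
        # -F treats the pattern as a fixed string, -q suppresses grep output
        if [ -f "$fsh_home/$fsh_file" ] && grep -qF -- "$fsh_pattern" "$fsh_home/$fsh_file"; then
            printf '%s\n' "$fsh_home/$fsh_file"
        fi
    done
    return 0
}

# Example (run as root): find_startup_hooks /root
```

If a file is listed, opening it should show the line that launches the script on `sudo -i`. The function only reads files, so it is safe to run on a live instance.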

I wonder why I have experienced this 3 times and seemingly no one else is experiencing it at all. Could it be that the machines on which I have deployed this are just slow enough that the reboot happens before cleanup is totally finished? That seems unlikely but I’m puzzled as to why this has been so consistent for me.

Am I understanding correctly that the landing page is something I cannot remove unless I activate via Let’s Encrypt? If so I’ll just have to live with it (no big deal) as neither of my use cases need or support anything other than a self-signed certificate.

szaimen · May 8, 2021, 1:59pm

Did you always install your instances using the install_production script? If yes, did you also run the 2nd script successfully? That is, https://github.com/nextcloud/vm/blob/master/nextcloud_install_production.sh and vm/nextcloud-startup-script.sh (at master, nextcloud/vm on GitHub)?

silversolver · May 8, 2021, 2:08pm

The first two times I did it I used prebuilt VMs where the install_production script had already been run, and only the second remained to run. In the third case, yes, that was the one I used, and it installed, rebooted, and ran the second script as expected. Nothing apparently went wrong with any of it.

szaimen · May 8, 2021, 2:11pm

Did it reboot the VMs again after exiting the 2nd script?

silversolver · May 8, 2021, 2:42pm

I think so. It has been long enough that I don’t remember for certain. If it did not, what would that have broken, and is it something that I could check?

szaimen · May 8, 2021, 2:47pm

Unfortunately, there is no way to find out what went wrong if you don’t remember any obvious errors or whether the script ran to the end.
I can only say that the startup script gets removed at the very end of the startup script. If it wasn’t removed, you didn’t run the startup script to the end, and hence something must have gone wrong during it or you canceled it by pressing [CTRL] + [c].
Here is the line that removes the script: vm/nextcloud-startup-script.sh at 04ec101f4b85fed16f231af8d75587447c76bc5a · nextcloud/vm · GitHub

silversolver · May 8, 2021, 3:00pm

Interesting. This is helpful. I guess I never thought about reviewing the script itself. I feel confident that in at least the most recent install I saw line 538, but I don’t remember anything after that. Would there be any harm in creating a truncated version of this script that starts right after that and running it as a cleanup script? Interestingly I don’t believe in any case the set trusted domain script has ever run, as I have always had to set them manually by editing the config. I’m not sure why the script would reliably fail for me right after the update, but it looks like that may be what is happening.

szaimen · May 8, 2021, 3:06pm

If you are sure that line 538 is the last line that you’ve seen, you should be able to create a truncated version of this script starting at this line. But don’t forget to put this into any truncated script: source /var/scripts/fetch_lib.sh
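A truncated script along those lines could be assembled roughly as follows. This is only a sketch: the helper name `make_truncated_script` and the output path in the example are hypothetical, and the start line is whatever line you are confident was the last one reached. The only part taken from this thread is the `source /var/scripts/fetch_lib.sh` line.

```shell
# make_truncated_script SRC START_LINE OUT
# Writes OUT as: a bash shebang, the library line mentioned above, then
# everything in SRC from START_LINE to the end of the file.
make_truncated_script() {
    mts_src="$1"
    mts_start="$2"
    mts_out="$3"
    [ -f "$mts_src" ] || { echo "no such file: $mts_src" >&2; return 1; }
    # Accept digits only, so a typo cannot be read by tail as an option
    case "$mts_start" in
        ''|*[!0-9]*) echo "start line must be a number" >&2; return 1 ;;
    esac
    {
        printf '%s\n' '#!/bin/bash'
        printf '%s\n' 'source /var/scripts/fetch_lib.sh'
        # tail -n +N prints from line N (1-based) through the last line
        tail -n "+$mts_start" "$mts_src"
    } > "$mts_out"
}

# Example (hypothetical output path):
# make_truncated_script /var/scripts/nextcloud-startup-script.sh 538 /tmp/finish-startup.sh
```

Read the generated file before running it. Cutting at an arbitrary line can land in the middle of a multi-line statement, such as a quoted message box, so the first few lines may need manual adjustment.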

Did you never see this menu and the option to Activate TLS?

nextcloud/vm/blob/04ec101f4b85fed16f231af8d75587447c76bc5a/menu/server_configuration.sh#L47-L58


      
          "Choose what you want to configure
          $CHECKLIST_GUIDE\n\n$RUN_LATER_GUIDE" "$WT_HEIGHT" "$WT_WIDTH" 4 \
          "deSEC" "(Automatically set up a dedyn.io domain, together with DDNS and TLS)" "$STARTUP_SWITCH" \
          "DDclient Configuration" "(Use ddclient for automatic DDNS updates)" OFF \
          "Activate TLS" "(Enable HTTPS with Let's Encrypt)" "$ACTIVATE_TLS_SWITCH" \
          "SMTP Mail" "(Enable being notified by mail from your server)" OFF \
          "Static IP" "(Set static IP in Ubuntu with netplan.io)" OFF \
          "Automatic updates" "(Automatically update your server every week on Sundays)" OFF \
          "GeoBlock" "(Only allow certain countries to access your server)" OFF \
          "Disk Monitoring" "(Check for S.M.A.R.T errors on your disks)" OFF \
          "Security" "(Add extra security based on this http://goo.gl/gEJHi7)" OFF \
          "Daily Backup Wizard" "([BETA] Create a Daily Backup script)" OFF 3>&1 1>&2 2>&3)

silversolver · May 8, 2021, 3:13pm

I have always gotten that menu and option, yes. I was referring to line 546. Perhaps that is not interactive but merely applies previously chosen options. I didn’t ever get presented with a prompt to add trusted domains in any of my installations.

As an aside, is deSEC preferable to using noIP and adding its updater manually other than being easier to configure?

szaimen · May 8, 2021, 3:17pm

This is most likely because adding a trusted domain is part of the Activate TLS script.

If you want to use your own domain, you should use the Activate TLS script.
Running the deSEC script will give you your own dedyn.io subdomain and then automatically configure your server with a trusted domain and Let’s Encrypt using that domain. One advantage is that this works without opening any ports to the public internet.

silversolver · May 8, 2021, 6:29pm

That is slick and very good to know! Thank you.

Based on the discussions we’ve been having it sounds like I should just truncate the setup script and (re)run the end to finish the job. I’ll try it on my actual VM first (where I can easily do a state backup via the files on the host lol) and see how it goes before I try it on the physical appliance.

Thank you so much for helping me work through this.

enoch85 · May 8, 2021, 9:00pm

No, the setup script can only be run once. So as I mentioned before - start over from scratch.

szaimen · May 8, 2021, 10:23pm

If he knows that he has seen line 546, he can definitely create a truncated script from the startup script starting there…

silversolver · May 9, 2021, 12:41am

Thanks @szaimen ! I definitely wasn’t going to start from scratch on 2 working instances if I could avoid it. I couldn’t see anything after that line which would be likely to break anything even if it had already run, so you’ve confirmed my suspicions that it should be OK. Now to try it and report how it goes.

silversolver · May 9, 2021, 1:23am

OK, I ran the truncated script to make it finish on my VM appliance, and now everything seems normal. If all seems good over the weekend I’ll do the same thing on the physical appliance.

I’m still a little baffled as to why three different instances had the same failure, but at least now I have what looks like a good solution. I have to wonder how many others have had the same experience but are just living with the problem. I still see no reason why it would have failed to complete the first time though.

enoch85 · May 9, 2021, 7:12am

Are you sure?

I’d say line 553

nextcloud/vm/blob/master/nextcloud-startup-script.sh#L553


apt autoclean

# Set trusted domain in config.php
run_script NETWORK trusted

# Remove preference for IPv4
rm -f /etc/apt/apt.conf.d/99force-ipv4
apt update

# Success!
msg_box "The installation process is *almost* done.

Please hit OK in all the following prompts and let the server reboot to complete the installation process."

msg_box "TIPS & TRICKS:
1. Publish your server online: https://goo.gl/iUGE2U
2. To login to PostgreSQL just type: sudo -u postgres psql nextcloud_db
3. To update this server just type: sudo bash /var/scripts/update.sh
4. Install apps, configure Nextcloud, and server: sudo bash $SCRIPTS/menu.sh"

msg_box "SUPPORT:

If you saw the text from that line you could just remove this manually:

nextcloud/vm/blob/04ec101f4b85fed16f231af8d75587447c76bc5a/nextcloud-startup-script.sh#L586-587



LOGIN:
Login to Nextcloud in your browser:
- IP: $ADDRESS
- Hostname: $(hostname -f)

### PLEASE HIT OK TO REBOOT ###"

# Reboot
print_text_in_color "$IGreen" "Installation done, system will now reboot..."
check_command rm -f "$SCRIPTS/you-can-not-run-the-startup-script-several-times"
check_command rm -f "$SCRIPTS/nextcloud-startup-script.sh"
reboot
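
For the manual removal, a cautious approach would be to list the two files first and only delete them on request. The sketch below assumes `$SCRIPTS` resolves to /var/scripts, as the paths earlier in this thread suggest; the function name `cleanup_startup_leftovers` and its `--apply` switch are made up for illustration and are not part of the VM scripts.

```shell
# cleanup_startup_leftovers SCRIPTS_DIR [--apply]
# Reports whether the two files removed at the end of the startup script
# still exist in SCRIPTS_DIR. Deletes them only when --apply is given.
cleanup_startup_leftovers() {
    csl_dir="$1"
    csl_mode="${2:-}"
    for csl_name in you-can-not-run-the-startup-script-several-times nextcloud-startup-script.sh; do
        csl_path="$csl_dir/$csl_name"
        if [ -e "$csl_path" ]; then
            if [ "$csl_mode" = "--apply" ]; then
                rm -f -- "$csl_path" && echo "removed: $csl_path"
            else
                echo "present: $csl_path"
            fi
        else
            echo "absent: $csl_path"
        fi
    done
}

# Dry run:  cleanup_startup_leftovers /var/scripts
# Delete:   sudo is needed on the server, e.g. run it from a root shell with --apply
```

Unlike the original script, this does not reboot and does not repeat any of the earlier steps (trusted domain, apt cleanup), so it only covers the case where everything before these two lines really did run.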

silversolver · May 9, 2021, 12:42pm

That is the puzzling thing; I don’t remember seeing that message on any of the instances I configured. Of course it’s possible I did and don’t remember, as this kind of FYI for the non-technical is stuff I gloss over, but I think I would have remembered seeing it at least once out of the three times I ran this. I do wonder how that ending looked in the NC19 appliance, as the first one I did was that. There were some changes that I noticed; all nice improvements. The NC19 one may have been the testing one from nextcloud.com, whereas the NC21 was the production one from Hansson IT.

I think I’m going to see if this behavior is reproducible by importing the VM one more time and checking whether it goes all the way to the end on the 4th try or stops short, just for science. I’m satisfied that nothing is broken in my running instances and don’t feel the need to mess with them beyond what I did already.