
Git Troubleshooting & Tips

Some reasons why we sometimes run into Git issues

  • Over the past years, we asked users (e.g. during trainings) to install Git with minimal guidance.
  • Even though more advanced tools (ARCitect) now bring their own Git installation, interference with older installations may still occur.
  • Tools (e.g. ARCitect and ARC Commander), or different versions of those tools, may handle Git-related tasks slightly differently or more or less strictly (e.g. using main as the default branch).
  • The current (versions of) tools were not really built for many people collaborating on one ARC (at least not with the default settings on the DataHUB side). Common errors are therefore related to merge conflicts (multiple users changing the same files) and divergent branches (e.g. between local and remote clones of the ARC).
  • Some behaviors are simply specific to a use case or setup and will require some stewardship even with the best tooling.
  1. (If required) install Git on the user's machine

  2. Navigate to the ARC in trouble (via one of the options below)

  • On macOS: you can drag&drop the ARC folder from Finder into a terminal
  • On macOS: right-click the ARC folder -> “Services” -> “New Terminal at Folder”
  • On Windows: open the folder via Explorer; type “cmd” or “powershell” into the address field on top of Explorer
  • On Linux / macOS terminal: cd path/to/ARC
  • From inside ARCitect: Tools -> Command Window
  3. Try some of the Git commands and debugging steps below
error message | possible reason | possible solution
remote: HTTP Basic: Access denied fatal: Authentication failed for 'https://git.nfdi4plants.org/UserName/ARCname' | Your computer is not “linked” to your DataHUB account | Access Denied
error: failed to push some refs to 'https://git.nfdi4plants.org/UserName/ARCname' hint: Your push was rejected due to missing or corrupt local objects. | You tried to upload LFS-tracked files that are not present on your computer | Missing LFS Objects
remote: GitLab: LFS objects are missing. Ensure LFS is properly set up or try a manual "git lfs push --all" | You tried to upload LFS-tracked files that are not present on your computer | Missing LFS Objects
LFS: PUT "<https://git.nfdi4plants.org/.../...>" read tcp ... i/o timeout | You ran into a timeout, likely due to very large single files | Prevent LFS time out error
error: failed to push some refs to 'https://git.nfdi4plants.org/UserName/ARCname' hint: Updates were rejected because the remote contains work that you do not have locally. | Your local ARC is out of sync with the remote. | ARC not in sync with the DataHUB
ERROR: Can not sync with remote as no remote repository address was specified. | There is no URL specified for your ARC’s remote | Git remote
ERROR: GIT: fatal: repository 'https://git.nfdi4plants.org/UserName/ARCname.git' not found | The remote URL does not exist | Git remote
ERROR: GIT: fatal: detected dubious ownership | This is an error typically seen when working on mounted network drives | Dubious ownership
fatal: credential-cache unavailable; no unix socket support | Likely happens on Windows, if a gitconfig contains credential.helper=cache | Adjust the Git Credential helper setting
fatal: Need to specify how to reconcile divergent branches. | Your ARC contains multiple branches that progressed independently and need to be merged | Contact a data steward.
error: unable to create file <path/to/file> : Filename too long | Likely occurs on Windows, if your ARC or files in your ARC are stored in a deeply nested folder, i.e. a folder in a folder in a folder … | Allow very long file names

Your two favorite Git commands: status and log


Whenever you are asked for ARC support that is likely related to a Git issue, the first thing you want to explore is the state of the ARC.

The following command gives a good summary of the ARC, including

  • the branch you are on
  • files that are staged for the next commit
  • files that were modified but not yet staged, as well as new (untracked) files
  • typically anything unusual (e.g. an unfinished merge)
Terminal window
git status

If everything’s clear and committed, this should print something like

Your branch is up to date with … nothing to commit, working tree clean
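
When checking several ARCs in a row, or before starting a sync from a script, the machine-readable form of git status can be handy. The following is only a minimal sketch, assuming a POSIX-like shell (macOS or Linux terminal, Git Bash); the function name arc_is_clean is made up for illustration and is not part of ARCitect or ARC Commander.

```shell
# Succeeds (exit code 0) if the ARC in the current folder has no
# uncommitted or untracked changes, fails otherwise.
# arc_is_clean is a hypothetical helper name, not part of any ARC tool.
arc_is_clean() {
    # --porcelain prints one line per changed or untracked file,
    # and nothing at all if the working tree is clean
    changes=$(git status --porcelain) || return 2   # e.g. not inside a Git repository
    [ -z "$changes" ]
}

# Usage (run inside the ARC folder):
#   arc_is_clean && echo "nothing to commit" || git status --short --branch
```

The short form git status --short --branch shown in the usage comment prints the same information as git status in a compact, one-line-per-file layout.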

Now, to compare the state of the local clone with that of the remote (i.e. the DataHUB) in more detail, use

Terminal window
git log

This displays the commit history (messages) of the ARC reverse-chronologically, i.e. top-most = latest. So if the top commit message of the local ARC is different from the last commit message displayed in the DataHUB, the ARC is out of sync.

If you like it prettier, remember “a dog”…

Terminal window
git log --all --decorate --oneline --graph

Hit q to close the log.
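
Instead of comparing commit messages by eye, Git can also count how many commits the local ARC is ahead of or behind the DataHUB. This is a sketch under the assumption that the current branch tracks a remote branch (as is usually the case after a clone); the function name arc_sync_state is hypothetical. Note that git fetch contacts the DataHUB and may ask for credentials, but it does not change any files in the ARC.

```shell
# Print how many commits the current branch is ahead of / behind its
# upstream branch on the remote.
# arc_sync_state is a hypothetical helper name used for illustration.
arc_sync_state() {
    # Update the local knowledge about the remote without touching local files
    git fetch --quiet || return 1
    # rev-list prints "<ahead><TAB><behind>" for HEAD versus its upstream
    counts=$(git rev-list --left-right --count 'HEAD...@{upstream}') || return 1
    ahead=${counts%%[[:space:]]*}
    behind=${counts##*[[:space:]]}
    echo "ahead=$ahead behind=$behind"
}

# Usage (run inside the ARC folder):
#   arc_sync_state
```

If both numbers are 0, local ARC and DataHUB are in sync. If only "behind" is greater than 0, a pull should be enough; if both are greater than 0, the branches have diverged and need to be merged.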

The gitconfig basically holds the settings and preferences for your Git installation. There are three types of gitconfigs. Depending on the tool (ARCitect, ARC Commander) and operating system (macOS, Linux, Windows), different Git settings may be read from different config files.

flag | meaning
--global | current user on that computer
--system | system-wide (all users)
--local | current repository (ARC)

The following command lists all configurations, where they originate from (--show-origin) and what their scope is (--show-scope).

Terminal window
git config --list --show-origin --show-scope

In order to only show e.g. the global gitconfig use

Terminal window
git config --global --list

Typical settings to explore and troubleshoot

  • the default branch should be: init.defaultbranch=main
  • user.name and user.email should be defined
  • if users keep being asked for passwords during sync with the DataHUB, they might not store their credentials via a credential.helper.

Editing the respective gitconfig is ideally done via command line (quick internet search helps).

Terminal window
git config --global user.name "Your Name"
git config --global user.email "Your eMail"
Terminal window
git config --global init.defaultBranch main
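
To quickly see which of the typical settings are missing on a user's machine, the individual keys can be queried one by one. A minimal sketch, assuming a POSIX-like shell; arc_check_gitconfig is a made-up helper name. It reads the effective value, i.e. whatever Git resolves from the system, global and local gitconfig together.

```shell
# Report the effective value of the settings that most often cause trouble.
# arc_check_gitconfig is a hypothetical helper name used for illustration.
arc_check_gitconfig() {
    rc=0
    for key in user.name user.email init.defaultBranch; do
        # git config --get exits with a non-zero code if the key is not set
        if value=$(git config --get "$key"); then
            echo "$key=$value"
        else
            echo "$key is not set"
            rc=1
        fi
    done
    return "$rc"
}

# Usage:
#   arc_check_gitconfig
```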

Adjust the Git Credential helper setting

The gitconfig setting credential.helper controls whether and how Git credentials are saved on your machine.

On Windows, you might run into the error fatal: credential-cache unavailable; no unix socket support, if it is set to credential.helper=cache.

This can be solved by either of the following:

  1. Remove “credential.helper=cache” via git config --global --unset credential.helper.
  2. Overwrite the setting with “store” instead of “cache” via git config --global credential.helper store. Note that “store” keeps the credentials unencrypted in a text file (~/.git-credentials).

Allow very long file names

Users (especially on Windows) run into errors with long file names (i.e. the full path including all folders). This setting should fix it:

Terminal window
git config --global core.longpaths true

Git remote

For ARCs the “remote” is the DataHUB. The remote address (ARC URL) is stored in the Git configuration of the local ARC. To display the URL to which the local ARC is connected, use

Terminal window
git remote -v

Adding a remote during arc sync


A default remote is usually added by ARC Commander or ARCitect. If the ARC does not yet exist in the DataHUB, and you created it via ARC Commander and synced it via arc sync, you will see this error:

Terminal window
ERROR: GIT: fatal: repository 'https://git.nfdi4plants.org/UserName/ARCname.git/' not found
GIT: warning: redirecting to https://git.nfdi4plants.org/UserName/ARCname.git/
...
GIT: remote: The private project UserName/ARCname was successfully created.

This is nothing to worry about: the ARC was created in the DataHUB during this process.

If you only see the error ERROR: GIT: fatal: repository 'https://git.nfdi4plants.org/UserName/ARCname.git/' not found, but not the subsequent lines stating that the ARC was created automatically, make sure to use the force option, i.e. arc sync --force ...

If git remote -v does not display any remote, you can add one via

Terminal window
git remote add origin https://git.nfdi4plants.org/<UserName>/<ARCName>

You can edit a remote via

Terminal window
git remote set-url origin https://git.nfdi4plants.org/<UserName>/<ARCName>
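
The two cases (no remote yet vs. a remote that needs a new URL) can be combined so that you do not have to check first which one applies. A minimal sketch, assuming the remote is called origin as in the commands above; arc_set_remote is a hypothetical helper name.

```shell
# Point the remote "origin" of the ARC in the current folder to the given URL,
# no matter whether a remote already exists.
# arc_set_remote is a hypothetical helper name used for illustration.
arc_set_remote() {
    url=$1
    if git remote get-url origin >/dev/null 2>&1; then
        # origin exists already: only change its URL
        git remote set-url origin "$url"
    else
        # no origin yet: add it
        git remote add origin "$url"
    fi
    # Print the URL that is now configured
    git remote get-url origin
}

# Usage (run inside the ARC folder, placeholders as in the commands above):
#   arc_set_remote "https://git.nfdi4plants.org/<UserName>/<ARCName>"
```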

As of now, the DataPLANT tools focus on working on a single branch (main). It can still happen that your ARC has multiple branches, e.g. by accident (see git config -> init.defaultbranch) or because some Git-savvy collaborator knows how to create them. To display the branches of the local ARC, use

Terminal window
git branch

If you also want to display branches that exist on the remote (but not locally), use

Terminal window
git branch --all
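
Since the tools expect the branch main, a quick check of the branch you are currently on can save some guessing. A minimal sketch, assuming a POSIX-like shell; arc_check_branch is a made-up helper name.

```shell
# Tell whether the ARC in the current folder is on the branch "main".
# arc_check_branch is a hypothetical helper name used for illustration.
arc_check_branch() {
    # symbolic-ref prints the name of the checked-out branch; it fails
    # if no branch is checked out (a so-called detached HEAD)
    current=$(git symbolic-ref --short HEAD) || return 1
    if [ "$current" = "main" ]; then
        echo "on main"
    else
        echo "on '$current', not main"
        return 1
    fi
}

# Usage (run inside the ARC folder):
#   arc_check_branch
```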

Git LFS is basically the system in the background that simplifies working with Git and (ARCs containing) large data files. ARC Commander and ARCitect offer options to download (clone) an ARC without large files. This speeds up the process and avoids wasting storage if you are only interested in, e.g., the metadata.

In order to properly upload large(r) files to the DataHUB via “pure git” (i.e. on the command line) or via ARC Commander or ARCitect, Git LFS needs to be initialized on every computer (and user account) before using these tools.

Checking whether LFS (large file storage) works properly for your ARCs

  • In ARCitect, you can see large files (defined by the threshold in the commit menu) flagged as LFS in the file tree
  • In the DataHUB LFS files are also flagged as LFS. In addition, you can click in the right sidebar of your ARC in the DataHUB on “Project Storage”. Here, the major amount of your data should be stored in “LFS”, while only a minor part is stored in “Repository”.
  • If you have git-lfs installed and know how to use the command line, simply run git lfs install.
  • You can check for the proper configuration via git config --list --show-origin --show-scope. Amongst others, the config should contain the following lines
filter.lfs.process=git-lfs filter-process
filter.lfs.required=true
filter.lfs.clean=git-lfs clean -- %f
filter.lfs.smudge=git-lfs smudge -- %f
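
Rather than scanning the full config listing, the four entries can be checked directly. This is a minimal sketch, assuming a POSIX-like shell; arc_check_lfs_config is a hypothetical helper name. It only checks that the keys are set, not that git-lfs itself is installed.

```shell
# Check whether the four filter.lfs.* entries listed above are configured.
# arc_check_lfs_config is a hypothetical helper name used for illustration.
arc_check_lfs_config() {
    rc=0
    for key in process required clean smudge; do
        if ! git config --get "filter.lfs.$key" >/dev/null; then
            echo "missing: filter.lfs.$key"
            rc=1
        fi
    done
    return "$rc"
}

# Usage: prints nothing if all four entries are present
#   arc_check_lfs_config && echo "LFS filters are configured"
```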

In your home folder (Windows: C:/Users/<UserName>, macOS: /Users/<UserName>), create or edit the file called .gitconfig to include the following lines.

[filter "lfs"]
process = git-lfs filter-process
required = true
clean = git-lfs clean -- %f
smudge = git-lfs smudge -- %f

Prevent LFS time out error

When users try to upload very large single files (the size of the individual file matters, not the overall push size), they might run into a timeout error. This setting, run inside the ARC folder, should fix it:

Terminal window
git config lfs.activitytimeout 0

Missing LFS Objects

The following errors are related to missing LFS objects:

Terminal window
hint: Your push was rejected due to missing or corrupt local objects.
error: failed to push some refs to 'https://git.nfdi4plants.org/UserName/ARCName.git'
Terminal window
remote: GitLab: LFS objects are missing. Ensure LFS is properly set up or try a manual "git lfs push --all".

Possible reasons why this happens:

  • you have downloaded (cloned) an ARC without the large files (i.e. only the pointer files) and try to upload it to another location on the DataHUB (i.e. new remote due to a transfer to other user, group, etc. or renamed ARC)
  • you moved a pointer file (instead of an actual large file) from one ARC on your computer to another ARC and tried to upload

In this case you would have to download all LFS objects from the original remote first -> ask a data steward for help.
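
To find out whether a file on disk is the actual data or only a pointer, you can look at its first line: LFS pointer files are small text files that start with a fixed version line. A minimal sketch, assuming a POSIX-like shell; arc_is_lfs_pointer is a made-up helper name.

```shell
# Succeeds if the given file is a Git LFS pointer file, i.e. a small text
# file standing in for the real (large) file.
# arc_is_lfs_pointer is a hypothetical helper name used for illustration.
arc_is_lfs_pointer() {
    [ -f "$1" ] || return 1
    # Only read the first bytes, so that huge binary files are not loaded
    first_line=$(head -c 100 "$1" | head -n 1 | tr -d '\000\r')
    [ "$first_line" = "version https://git-lfs.github.com/spec/v1" ]
}

# Usage (the file name is just an example):
#   arc_is_lfs_pointer assays/RNAseq_RawData/dataset/sample1.fastq.gz \
#       && echo "only a pointer, the large file is not on this computer"
```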

Step-by-step track large file(s) via LFS


This is done in small steps with logging. Note that this works in shells like the macOS terminal, a Linux terminal, or Git Bash (available for Windows). It likely does not work in Windows PowerShell and definitely not in the Windows command prompt.

  1. Track files via LFS (this adds them to .gitattributes)

    Terminal window
    git lfs track "assays/RNAseq_RawData/dataset/**"
  2. Git add the .gitattributes file first

    Terminal window
    git add .gitattributes
  3. Git add the large files

    Terminal window
    git add assays/RNAseq_RawData/dataset/*
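  4. Git commit (and write what’s happening to a log file). Run this step in the foreground, i.e. without a trailing &, so that the commit has finished before the push starts

    Terminal window
    GIT_CURL_VERBOSE=1 GIT_TRACE=1 GIT_TRACE_PACKET=1 git commit -m "add rnaseq files to LFS" -v >> git-commit-LFS.log 2>&1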
  5. Git push (and write what’s happening to a log file)

    Terminal window
    GIT_CURL_VERBOSE=1 GIT_TRACE=1 GIT_TRACE_PACKET=1 git push -v >> git-push-LFS.log 2>&1 &

Check the status of LFS-tracked files

Terminal window
git lfs status

To get a list of LFS-tracked files including the size of the original file, run

Terminal window
git lfs ls-files -ls

This will display the object ID (oid), the relative path to the file and the object size. The oid is also stored in the pointer file at the file’s position.

To get a report of all LFS-tracked files including their status, use

Terminal window
git lfs ls-files -d

Amongst others, this report will print for every LFS file, whether it is downloaded (checkout: true; download: true) to the local ARC or not (checkout: false; download: false).

Common issues and error messages


ARC files opened in multiple programs


A common source of issues is multiple programs working on the ARC in parallel.

  • In particular, working on the ARC with multiple software tools that have Git integration may lead to confusion. For instance, while you sync the ARC using ARCitect or ARC Commander, the changes may still be displayed as uncommitted in VSCode, RStudio, PyCharm or other third-party software.

  • Many programs produce hidden temporary files. By default these files are not shown or synced by ARCitect or ARC Commander. They might still sometimes lead to confusion, e.g. not being able to commit changes. This is especially the case for office software (Excel, Word, LibreOffice, etc.), where e.g. one of the ISA files (isa.investigation.xlsx, isa.study.xlsx, isa.assay.xlsx) or another office file stored in the ARC may be open. ARCs opened in Windows Explorer or macOS Finder have sometimes led to issues as well.

  • Before syncing an ARC, close all ARC files and Explorer / Finder windows

  • Avoid editing, deleting, or moving files while the ARC is being synced to the DataHUB

ARC not in sync with the DataHUB


Your local ARC is likely out of sync with the remote. This happens if you or an invited colleague work(s) on the same ARC from a different location (e.g. the DataHUB or another computer). Before working on your ARC, make sure to update the local clone via one of these:

  • ARCitect -> Versioning -> Pull
  • arc sync
  • git pull (-> this would also prompt a message if changes need to be merged)

Access Denied

Sometimes you run into permission issues such as

Terminal window
remote: HTTP Basic: Access denied. The provided password or token is incorrect or your account has 2FA enabled and you must use a personal access token instead of a password.
Terminal window
fatal: Authentication failed for 'https://git.nfdi4plants.org/UserName/ARCName.git/'

This is due to missing or outdated DataHUB credentials on your computer. It usually helps to just retrieve new ones. If not, you might have to remove existing credentials stored on your computer.

Option 1: via ARC Commander

Option 2: “by hand”

  1. Log in to the DataHUB
  2. Create a new Personal Access Token (PAT) with scope api
  3. Run a git command (e.g. arc sync, git pull) to trigger being asked for git credentials
    1. Provide your DataHUB username
    2. Use the token instead of your password

If (new) authentication alone does not help, you might need to delete existing tokens or passwords first.

  1. Run git config --get-regexp "credential" to find out whether and where credentials are stored

  2. This typically displays one of the following

    credential.helper store

    credential.helper osxkeychain (only on macOS)

  3. If credential.helper store is displayed, the credentials are typically stored in ~/.git-credentials, a hidden text file stored in the user’s home folder. Edit this file and delete the row(s) containing “git.nfdi4plants.org” (https://<UserName>:<Token>@git.nfdi4plants.org).

  4. On macOS (if credential.helper osxkeychain is displayed) open the app “Keychain Access”, search and delete passwords for “git.nfdi4plants.org”.
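
For the credential.helper store case, the manual edit of ~/.git-credentials can also be done on the command line. The following is only a sketch, assuming a POSIX-like shell and the default file location described above; arc_forget_datahub_credentials is a hypothetical helper name. It removes every line that mentions git.nfdi4plants.org and leaves credentials for other servers untouched.

```shell
# Remove stored DataHUB credentials from a git-credentials text file.
# arc_forget_datahub_credentials is a hypothetical helper name.
# Optional argument: path to the file (default: ~/.git-credentials).
arc_forget_datahub_credentials() {
    cred_file=${1:-"$HOME/.git-credentials"}
    [ -f "$cred_file" ] || { echo "no such file: $cred_file"; return 0; }
    tmp_file=$(mktemp) || return 1
    # Keep every line except those for the DataHUB host
    # (grep exits non-zero if nothing is left, which is fine here)
    grep -v 'git\.nfdi4plants\.org' "$cred_file" > "$tmp_file" || true
    # Write back into the original file so that it keeps its permissions
    cat "$tmp_file" > "$cred_file"
    rm -f "$tmp_file"
}

# Usage:
#   arc_forget_datahub_credentials
```

The next Git command that talks to the DataHUB (e.g. git pull) should then ask for username and token again.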

Dubious ownership

The error ERROR: GIT: fatal: detected dubious ownership typically occurs when working on a mounted network drive (Fileshare, File Server, NAS). Very simplified: the user on the computer and the owner of the network drive differ, and Git tries to protect you from working in a folder you do not own.

You can add the path to the ARC to the list of safe directories via the command

Terminal window
git config --global --add safe.directory %(prefix)///servername/share/path/to/ARC/

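Alternatively, you can circumvent this error by adding all directories to your list of safe directories. Note that this switches off the ownership check for every repository on your computer. The asterisk is quoted so that the shell does not expand it to the file names in the current folder.

Terminal window
git config --global --add safe.directory "*"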
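
If you prefer to mark individual ARCs as safe, it helps to avoid duplicate entries when the command is run repeatedly. A minimal sketch, assuming a POSIX-like shell and an ARC that is reachable under a normal folder path (for Windows network shares, the %(prefix) form shown above is needed instead); arc_mark_safe is a made-up helper name.

```shell
# Add one ARC folder to the global list of safe directories, unless it is
# already listed.
# arc_mark_safe is a hypothetical helper name used for illustration.
arc_mark_safe() {
    # Resolve the given folder to an absolute path
    arc_path=$(cd "$1" && pwd) || return 1
    if git config --global --get-all safe.directory 2>/dev/null | grep -Fxq -- "$arc_path"; then
        echo "already listed: $arc_path"
    else
        git config --global --add safe.directory "$arc_path"
        echo "added: $arc_path"
    fi
}

# Usage:
#   arc_mark_safe path/to/ARC
#   git config --global --get-all safe.directory   # show the resulting list
```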

To help with troubleshooting, put some or all of the variables GIT_CURL_VERBOSE=1 GIT_TRACE=1 GIT_TRACE_PACKET=1 before your Git command to get more information, e.g.

Terminal window
GIT_CURL_VERBOSE=1 GIT_TRACE=1 GIT_TRACE_PACKET=1 git push -v >> git-push-LFS.log 2>&1 &