• 0 Posts
  • 3 Comments
Joined 3 years ago
Cake day: June 28th, 2023

  • If you have an uplink of 1 Gbit/s or less, you can easily solve the port problem by purchasing a $3 switch. By the way, there are mini PCs with 4/6/8 ports, and some even with optical fiber.

    And in general, if the topic starter is building their own server, they can just build a router out of it too. The set of programs is not very large: kea-dhcp, radvd, iptables; that's all. For WiFi, you will need a compatible card in the server or a separate access point like Ubiquiti.
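    For the router part, the firewall/NAT side can be sketched roughly as below. The interface names wan0/lan0 are assumptions (substitute your own); kea-dhcp and radvd each need their own config files on top of this.

    ```shell
    #!/bin/sh
    # Minimal NAT router sketch; requires root. wan0 = uplink, lan0 = LAN side.

    # let the kernel forward packets between interfaces
    sysctl -w net.ipv4.ip_forward=1
    sysctl -w net.ipv6.conf.all.forwarding=1

    # masquerade LAN traffic leaving through the uplink
    iptables -t nat -A POSTROUTING -o wan0 -j MASQUERADE

    # forward LAN -> WAN, and let established replies back in
    iptables -A FORWARD -i lan0 -o wan0 -j ACCEPT
    iptables -A FORWARD -i wan0 -o lan0 -m state --state ESTABLISHED,RELATED -j ACCEPT
    ```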


  • Yes, it is. But I have llama-swap and open-webui. If you spend some time on the llama-swap configuration, you have a good chance of running a model across 2 cards through llama.cpp. The gain, however, will of course not be 2x, and it falls off non-linearly with the number of cards. You also need a motherboard with good PCIe lanes (2 PCIe x16 slots or more). But it's still cheaper than one large card. Example:

    # HIP_VISIBLE_DEVICES selects which ROCm GPUs llama.cpp can see;
    # --tensor-split 1,1 divides the model layers evenly between the two cards.
    HIP_VISIBLE_DEVICES=0,1 \
    /opt/llama.cpp/build/bin/llama-server \
      --host 127.0.0.1 \
      --port 8082 \
      --model /storage/models/model.gguf \
      --n-gpu-layers all \
      --split-mode layer \
      --tensor-split 1,1 \
      --ctx-size 32768 \
      --batch-size 512 \
      --ubatch-size 512 \
      --flash-attn on \
      --parallel 1
    

    There is a less stable but faster alternative: --split-mode row
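    For reference, wiring a command like that into llama-swap is mostly a matter of putting it into the config file. A sketch along these lines, where the model name, paths, and ttl value are assumptions and the field names follow my reading of the llama-swap README (check it for the current format):

    ```yaml
    models:
      "my-model":
        # llama-swap substitutes ${PORT} with the port it proxies to
        cmd: |
          HIP_VISIBLE_DEVICES=0,1 /opt/llama.cpp/build/bin/llama-server
          --port ${PORT}
          --model /storage/models/model.gguf
          --n-gpu-layers all --split-mode layer --tensor-split 1,1
        # optionally unload the model after 5 minutes of idleness
        ttl: 300
    ```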

    P.S. By the way, a single RX 9070 XT on my instance translates posts and comments. You can test it if you want. =)